Preface



This problem book covers all the traditional topics in modern statisti- 
cal theory and is designed for students at technical colleges and univer- 
sities who have mathematical statistics as an obligatory course. 

The problems are mostly analytical. The student is asked to prove 
the validity of an assertion or carry out an investigation. This will 
help him grasp the main aspects of mathematical statistics. Some of 
the problems are more difficult and can be used as individual assign- 
ments for course papers. 

We have included problems on computer simulation of random variables
in order to obtain the data for statistical interpretation. Any
"theoretical" problem which contains a statistical algorithm for data
analysis can be used (with the appropriate (practically infinite) choice
of the model parameters) to formulate a "practical" problem. At the
first stage the original data should be simulated using either published 
tables of random numbers or special computer programs. Then, by 
interpreting these "experimental" results according to the algorithm 
in question, the student can compare the theoretical hypothesis with 
the original parameters which are known as they were used when the 
sample was simulated. 

All the problems differ in complexity. More difficult problems are
marked with an asterisk and may require a significant effort on the 
part of the reader. Problems that cannot be reduced to standard al- 
gorithms are answered in detail or hints are given. 

Each chapter contains the basic notions, assertions, and formulas 
from the respective theoretical section. The statistical tables at the end 
of the book will help the reader obtain numerical results. The list of 
distributions will help him choose problems on different aspects of
the same model. 

The Authors 



THEORY AND PROBLEMS 

CHAPTER 1 

Principles of Statistical Description. 

Sampling Characteristics 
and Their Distributions 



1.1. Problems in mathematical statistics are based on statistical data
obtained by observations on a finite set of random variables
$X = (X_1, \ldots, X_n)$ which describe the outcome of an experiment. We say
that the experiment consists of $n$ trials, where the $i$th trial results in
a random variable $X_i$, $i = 1, \ldots, n$. A set of observable random
variables $X = (X_1, \ldots, X_n)$ is called a sample, the values $X_i$,
$i = 1, \ldots, n$, are called the elements (units) of a sample, and the
number $n$ is called the sample size. A set
$\mathcal{X} = \{x = (x_1, \ldots, x_n)\}$ of all possible realizations of the
sample $X = (X_1, \ldots, X_n)$ is called a sample space. When the true
distribution of $X$ (the distribution function
$F_X(x_1, \ldots, x_n) = P(X_1 \le x_1, \ldots, X_n \le x_n)$) is unknown
(completely or partially) and only the class (family) of admissible
distributions $\mathcal{F} = \{F(x_1, \ldots, x_n)\}$ which contains the
distribution $F_X$ of the sample $X$ is specified, then we have a statistical
model $(\mathcal{X}, \mathcal{F})$ (or simply model $\mathcal{F}$).
Mathematical statistics reveals (within a given model $\mathcal{F}$) the
properties of the true distribution $F_X$ using the results of observations
on the sample $X$.

Some experiments consist of repeated independent observations on
a random variable $\xi$ (with the distribution $\mathcal{L}(\xi)$). Then the
sample $X = (X_1, \ldots, X_n)$ is a set of independent identically
distributed random variables, where $\mathcal{L}(X_i) = \mathcal{L}(\xi)$,
$i = 1, \ldots, n$. To be concise, we say that $X = (X_1, \ldots, X_n)$ is a
sample from the distribution $\mathcal{L}(\xi)$. The statistical model for
repeated independent observations is written as $\mathcal{F} = \{F_\xi\}$,
i.e., we only indicate the class of admissible distribution functions of
the original random variable $\xi$.

If $\mathcal{F} = \{F(x; \theta),\ \theta \in \Theta\}$, i.e., the admissible
distribution functions are defined up to a parameter $\theta$, then the
model is said to be parametric, and the set $\Theta$ of the possible values
of $\theta$ is called a parametric set.

We will only consider absolutely continuous or discrete models and
use $f_\xi(x) = f(x)$ ($f(x; \theta)$ for parametric models) to denote the
distribution density of the random variable $\xi$ if the distribution
$F_\xi$ is absolutely continuous, and the probability $P(\xi = x)$ if it is
discrete.

In the case of a parametric model the distribution of probabilities
on a sample space $\mathcal{X}$ which corresponds to the parameter $\theta$ is
denoted $P_\theta$. Similarly, $E_\theta T(X)$, $D_\theta T(X)$ are used to
denote the moments of a given function $T(X)$ of the sample $X$ when
$F_X(x; \theta)$ is the distribution function of the sample.

1.2. Many problems in mathematical statistics concern sequences
of random variables $\{\eta_n\}$ which converge to a limit $\eta$ (a random
variable or a constant) as $n \to \infty$. We will use two forms of
convergence, i.e., convergence in probability
($\eta_n \xrightarrow{P} \eta \iff P(|\eta_n - \eta| > \varepsilon) \to 0$
$\forall \varepsilon > 0$) and convergence in distribution, or weak
convergence ($\mathcal{L}(\eta_n) \to \mathcal{L}(\eta)$ or
$\eta_n \xrightarrow{d} \eta \iff F_{\eta_n}(x) \to F_\eta(x)$
$\forall x \in C(F_\eta)$, where $C(F)$ is the set of points of continuity of
the function $F(x)$). Note that the $P$-convergence implies the
$\mathcal{L}$-convergence. The inference on the $P$-convergence of various
sampling characteristics often follows from the general assertion on the
convergence of functions of random variables [7, p. 27], i.e., if
$\eta_{ni} \xrightarrow{P} c_i = \text{const}$, $i = 1, \ldots, r$, and
$\varphi(x_1, \ldots, x_r)$ is an arbitrary function continuous in the
neighbourhood of the point $(c_1, \ldots, c_r)$, then
$\varphi(\eta_{n1}, \ldots, \eta_{nr}) \xrightarrow{P} \varphi(c_1, \ldots, c_r)$.

1.3. If $X = (X_1, \ldots, X_n)$ is a sample from a distribution
$\mathcal{L}(\xi)$, then $F_\xi(x) = F(x)$ is called a theoretical
distribution function, and

$$F_n(x) = \frac{\mu_n(x)}{n} = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x)$$

is an empirical distribution function (here $\mu_n(x)$ is the number of
elements in a sample which satisfy the condition $X_j \le x$, and $I(A)$ is
the indicator of the event $A$).

By the Bernoulli theorem, $F_n(x) \xrightarrow{P} F(x)$ $\forall x$ as
$n \to \infty$, i.e., for large $n$ the value of $F_n(x)$ can be an estimate
for $F(x)$. The Glivenko and Kolmogorov theorems on the asymptotic
properties of $F_n(x)$ for large $n$ [7, p. 22] prove that the empirical
distribution function can be an estimator for the theoretical distribution
function.
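
The definition of the empirical distribution function can be sketched in a few lines of Python. This is only an illustration added here: the helper name `empirical_cdf` and the toy sample are assumptions, and the convention "count the elements not exceeding x" is the one stated above.

```python
from bisect import bisect_right

def empirical_cdf(sample):
    """Return F_n as a function: F_n(x) = (number of X_i <= x) / n."""
    ordered = sorted(sample)
    n = len(ordered)

    def F_n(x):
        # bisect_right counts the elements of the sample that are <= x
        return bisect_right(ordered, x) / n

    return F_n

# Toy sample (hypothetical data, chosen only to make the steps visible)
F_n = empirical_cdf([0.3, 0.1, 0.4, 0.1, 0.5])
print(F_n(0.1), F_n(0.35), F_n(1.0))
```

Sorting once and using a binary search keeps each evaluation of F_n cheap, which is convenient when the function has to be plotted on a fine grid.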

If a random variable $\xi$ is discrete and assumes the values $a_1$,
$a_2, \ldots$, then the distribution law for $\xi$ may be conveniently
represented by the frequencies $h_r/n$, where $h_r$ is the number of units
in a sample which are equal to $a_r$. Then
$h_r/n \xrightarrow{P} P(\xi = a_r)$, $r = 1, 2, \ldots$, as $n \to \infty$.

If the values of $\xi$ have the density $f_\xi(x) = f(x)$, we may
investigate the frequencies $h_k/n$ of the events $\{\xi \in \Delta_k\}$,
where $\{\Delta_k\}$ is a system of nonintersecting intervals into which the
region of the possible $\xi$-values is divided. Then

$$\frac{h_k}{n} \xrightarrow{P} P(\xi \in \Delta_k) = \int_{\Delta_k} f(x)\,dx$$

as $n \to \infty$, and, if $\Delta_k$ are small, we may use the frequencies
$h_k/n$ to construct a histogram and a frequency polygon which are close to
the graph of the function $f(x)$ [7, p. 23] and give an approximate form
of the distribution of $\xi$.

Every theoretical characteristic $g = \int g(x)\,dF(x)$ corresponds to its
statistical analogue (copy)

$$G = G(X) = \int g(x)\,dF_n(x) = \frac{1}{n} \sum_{i=1}^{n} g(X_i)$$

which is called the empirical or sampling characteristic. Specifically,
sampling moments are statistical analogues for theoretical moments.
The quantity

$$A_{nk} = A_{nk}(X) = \frac{1}{n} \sum_{i=1}^{n} X_i^k$$

is a sampling moment of $k$th order. At $k = 1$ the quantity $A_{n1}$ is
called a sample mean and is denoted $\bar X$, viz.,

$$\bar X = \frac{1}{n} \sum_{i=1}^{n} X_i.$$

The quantity

$$M_{nk} = M_{nk}(X) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^k$$

is called the central sampling moment of $k$th order. At $k = 2$ the
quantity $M_{n2}$ is called the sample variance and is denoted
$S^2 = S^2(X)$, viz.,

$$S^2(X) = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar X)^2.$$

The notation $S'^2 = \frac{n}{n-1} S^2$ may also be used. The absolute
sampling moments, sampling semi-invariants, etc., are introduced in a
similar way.
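
The sampling moments defined above translate directly into code. The following is a minimal sketch; the function names and the toy data are illustrative assumptions, not part of the text.

```python
def sample_moment(xs, k):
    """A_nk = (1/n) * sum of X_i ** k."""
    return sum(x ** k for x in xs) / len(xs)

def central_sample_moment(xs, k):
    """M_nk = (1/n) * sum of (X_i - sample mean) ** k."""
    mean = sample_moment(xs, 1)
    return sum((x - mean) ** k for x in xs) / len(xs)

# Hypothetical data, used only to show the calls
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(xs)
mean = sample_moment(xs, 1)             # sample mean
s2 = central_sample_moment(xs, 2)       # sample variance S^2 (divisor n)
s2_prime = n / (n - 1) * s2             # S'^2 (divisor n - 1)
print(mean, s2, s2_prime)
```

Note that S^2 here uses the divisor n, as in the definition above, while S'^2 rescales it by n/(n - 1).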

Sample quantiles are another example of sampling characteristics.
A $p$-quantile for any distribution function $F(x)$ is defined as
$\zeta_p = \inf\{x : F(x) \ge p\}$, $0 < p < 1$, and a sample $p$-quantile
$Z_{n,p}$ is a $p$-quantile of the empirical distribution function
$F_n(x)$. If the units of the sample $X = (X_1, \ldots, X_n)$ are arranged in
increasing order of magnitude, we get a new sequence of random variables

$$X_{(1)} \le X_{(2)} \le \ldots \le X_{(n)}$$

which is called an ordered series of a sample. Here $X_{(k)}$ is the
$k$th-order statistic, $k = 1, \ldots, n$, and $X_{(1)}$ and $X_{(n)}$ are the
extrema of the sample. Then we can express $Z_{n,p}$ through order
statistics

$$Z_{n,p} = \begin{cases} X_{([np]+1)} & \text{for non-integer } np, \\ X_{(np)} & \text{for integer } np. \end{cases}$$

Specifically, $Z_{n,1/2}$ is a sample median.
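
The expression of the sample quantile through order statistics can be sketched as follows. The function name `sample_quantile` and the floating-point tolerance used to decide whether np is an integer are assumptions of this sketch.

```python
import math

def sample_quantile(sample, p):
    """Z_{n,p}: X_([np]+1) if np is not an integer, X_(np) otherwise."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    ordered = sorted(sample)           # X_(1) <= ... <= X_(n)
    n = len(ordered)
    np_ = n * p
    nearest = round(np_)
    if nearest >= 1 and math.isclose(np_, nearest, abs_tol=1e-9):
        k = nearest                    # integer np: take X_(np)
    else:
        k = math.floor(np_) + 1        # non-integer np: take X_([np]+1)
    return ordered[k - 1]              # shift to 0-based indexing

print(sample_quantile([5, 1, 4, 2, 3], 0.5))   # np = 2.5, so X_(3)
print(sample_quantile([1, 2, 3, 4], 0.5))      # np = 2, so X_(2)
```

The tolerance only guards against rounding in the product n * p; with exact arithmetic the two branches coincide with the two cases of the formula.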

Any sampling characteristic which is a continuous function of a
finite number of the values $A_{nk}$ (in particular, the sampling moments
and central sampling moments $M_{nk}$) converges in probability to the
respective theoretical characteristic as $n \to \infty$ and can be an
estimator for the latter when the number $n$ of observations is
sufficiently large. Similarly, $Z_{n,p} \xrightarrow{P} \zeta_p$ if only the
distribution $\mathcal{L}(\xi)$ has a smooth density.

1.4. The sampling theory studies various properties of the distribution
of sampling characteristics in exact and asymptotic (for large sample
sizes) forms. When investigating the asymptotic behaviour (as
$n \to \infty$) of distributions, the limit theorems of probability theory
(specifically, the law of large numbers and the Central Limit Theorem)
are frequently used. We take their simplest forms from [2].

The law of large numbers. If the random variables $\eta_1, \eta_2, \ldots,
\eta_n$ are independent, identically distributed, and their expected
values are $E\eta_i = a$, then as $n \to \infty$

$$\frac{1}{n}(\eta_1 + \ldots + \eta_n) \xrightarrow{P} a.$$

The Central Limit Theorem. If in addition to the above conditions
there exists $D\eta_i = \sigma^2 > 0$, then as $n \to \infty$

$$\mathcal{L}\big((\eta_1 + \ldots + \eta_n - na)/(\sigma\sqrt{n})\big) \to \mathcal{N}(0, 1).$$

A multi-dimensional version of the Central Limit Theorem has the
form: let the $r$-dimensional random vectors
$\eta_n = (\eta_{n1}, \ldots, \eta_{nr})$, $n = 1, 2, \ldots$, be independent,
identically distributed, and have finite moments

$$a_i = E\eta_{1i}, \quad b_{ij} = \operatorname{cov}(\eta_{1i}, \eta_{1j}), \quad i, j = 1, \ldots, r.$$

Then as $n \to \infty$

$$\mathcal{L}(\nu_{n1}, \ldots, \nu_{nr}) \to \mathcal{N}(0, B = \|b_{ij}\|),$$

$$\nu_{ni} = (\eta_{1i} + \ldots + \eta_{ni} - na_i)/\sqrt{n}, \quad i = 1, \ldots, r.$$

(For the definition of a multi-dimensional (multivariate) normal
distribution see Sec. 1.6.)

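Let us formulate some assertions on the convergence of the functions
of random variables, which we will need to solve problems.

1°. If $\eta_n \xrightarrow{P} \eta$ and the function $\varphi$ is continuous,
then $\varphi(\eta_n) \xrightarrow{P} \varphi(\eta)$.

2°. Let $\{\eta_n, \zeta_n\}$, $n = 1, 2, \ldots$, be a sequence of pairs of
random variables. Then

(a) $\eta_n - \zeta_n \xrightarrow{P} 0$, $\mathcal{L}(\zeta_n) \to \mathcal{L}(\zeta)$
$\Rightarrow$ $\mathcal{L}(\eta_n) \to \mathcal{L}(\zeta)$;

(b) $\mathcal{L}(\eta_n) \to \mathcal{L}(\eta)$, $\zeta_n \xrightarrow{P} 0$
$\Rightarrow$ $\eta_n \zeta_n \xrightarrow{P} 0$;

(c) $\mathcal{L}(\eta_n) \to \mathcal{L}(\eta)$, $\zeta_n \xrightarrow{P} c = \text{const}$
$\Rightarrow$ $\mathcal{L}(\eta_n + \zeta_n) \to \mathcal{L}(\eta + c)$,
$\mathcal{L}(\eta_n \zeta_n) \to \mathcal{L}(c\eta)$,
$\mathcal{L}(\eta_n/\zeta_n) \to \mathcal{L}(\eta/c)$ for $c \ne 0$;

(d) $\eta_n - \zeta_n \xrightarrow{P} 0$, $\mathcal{L}(\zeta_n) \to \mathcal{L}(\zeta)$,
the function $\varphi$ is continuous
$\Rightarrow$ $\varphi(\eta_n) - \varphi(\zeta_n) \xrightarrow{P} 0$.

3°. Let $T_n = T_n(X)$, $X = (X_1, \ldots, X_n)$, be the estimator of a scalar
parameter $\theta$ in the model
$\mathcal{F} = \{F(x; \theta),\ \theta \in \Theta\}$ such that
$\mathcal{L}(\sqrt{n}(T_n - \theta)) \to \mathcal{N}(0, \sigma^2(\theta))$ as
$n \to \infty$ and for all $\theta \in \Theta$. Suppose also that the function
$\varphi$ is differentiable and $\varphi' \ne 0$. Then

$$\mathcal{L}\big(\sqrt{n}(\varphi(T_n) - \varphi(\theta))\big) \to \mathcal{N}\big(0, [\varphi'(\theta)\sigma(\theta)]^2\big).$$

Besides, if the functions $\varphi'$ and $\sigma$ are continuous, then

$$\mathcal{L}\left(\sqrt{n}\,\frac{\varphi(T_n) - \varphi(\theta)}{\varphi'(T_n)\sigma(T_n)}\right) \to \mathcal{N}(0, 1).$$

The generalization of 3° to the case of a vector parameter
$\theta = (\theta_1, \ldots, \theta_r)$ has the following form.

4°. Let $T_n = (T_{n1}, \ldots, T_{nr})$ be an estimator of the parameter
$\theta$ satisfying the condition
$\mathcal{L}(\sqrt{n}(T_n - \theta)) \to \mathcal{N}(0, \Sigma(\theta))$ as
$n \to \infty$ for all $\theta \in \Theta$. Then for any differentiable
function $\varphi$ of $r$ variables we have

$$\mathcal{L}\big(\sqrt{n}(\varphi(T_n) - \varphi(\theta))\big) \to \mathcal{N}(0, \sigma_\varphi^2(\theta))$$

under the condition that $\sigma_\varphi(\theta) \ne 0$, where
$\sigma_\varphi^2(\theta) = b'(\theta)\Sigma(\theta)b(\theta)$,
$b(\theta) = \left(\dfrac{\partial\varphi(\theta)}{\partial\theta_1}, \ldots, \dfrac{\partial\varphi(\theta)}{\partial\theta_r}\right)'$.
Moreover, if the function $\varphi$ is continuously differentiable and all
the elements of the matrix of the second moments $\Sigma(\theta)$ are
continuous in $\theta$, then

$$\mathcal{L}\big(\sqrt{n}[\varphi(T_n) - \varphi(\theta)]/\sigma_\varphi(T_n)\big) \to \mathcal{N}(0, 1).$$
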
By the Central Limit Theorem the sampling moment $A_{nk}$ is
asymptotically normal and its parameters are $\alpha_k = E\xi^k$ and
$\frac{1}{n} D\xi^k = (\alpha_{2k} - \alpha_k^2)/n$, which may be briefly
written as
$\mathcal{L}(A_{nk}) \sim \mathcal{N}(\alpha_k, (\alpha_{2k} - \alpha_k^2)/n)$.
The joint distribution of any finite number of sampling moments $A_{nk}$
is also asymptotically normal, as well as (under some additional
conditions) the distribution of any differentiable function of a finite
number of moments $A_{nk}$. Specifically, central sampling moments
$M_{nk}$ are also asymptotically normal.

We use direct analysis of exact distributions of order statistics
$X_{(k)}$ to investigate the asymptotic behaviour of $X_{(k)}$ as
$n \to \infty$. For the distributions $\mathcal{L}(\xi)$ with smooth densities
the mid terms of the ordered series (i.e., when the number $k = k(n)$
satisfies the condition $k/n \to p$, $0 < p < 1$) are asymptotically normal,
while for the extreme order statistics (i.e., for $X_{(r)}$, $X_{(n-s+1)}$
at fixed $r, s \ge 1$) the class of limiting distributions only consists of
three types of distributions, which are not normal [7, p. 35].

1.5. Some formulas of probability theory, which are used to obtain
an explicit form of a distribution when transforming the random
variables, are appropriate here. Let the vector $X = (X_1, \ldots, X_k)$ have
an absolutely continuous distribution with the density $f(x)$,
$x = (x_1, \ldots, x_k) \in S \subseteq R^k$, and let
$h = (h_1, \ldots, h_k)$: $S \to R^k$ be an arbitrary, one-to-one, and smooth
(i.e., all its partial derivatives $\partial h_i(x)/\partial x_j$ are
continuous) transformation whose Jacobian

$$J(x) = \det \begin{pmatrix} \dfrac{\partial h_1(x)}{\partial x_1} & \cdots & \dfrac{\partial h_k(x)}{\partial x_1} \\ \vdots & & \vdots \\ \dfrac{\partial h_1(x)}{\partial x_k} & \cdots & \dfrac{\partial h_k(x)}{\partial x_k} \end{pmatrix}$$

does not vanish on $S$. Then the distribution density of the random
vector $Y = h(X) = (h_1(X), \ldots, h_k(X))$ has the form

$$g_Y(y) = f(h^{-1}(y))/|J(h^{-1}(y))|, \quad y = (y_1, \ldots, y_k) \in h(S), \qquad (1.2)$$

where $h^{-1}$ is a transformation inverse to $h$, i.e.,
$h^{-1}(h(x)) = x$. Two special cases are frequently encountered. If
$k = 1$, then we have to transform the random variable $Y = h(X)$, where
$h(x)$ is a one-to-one smooth function with a non-vanishing derivative.
Then the distribution density of $Y$ has the form

$$g_Y(y) = f(h^{-1}(y))/|h'(h^{-1}(y))|. \qquad (1.3)$$

If we have a linear transformation $Y = AX + b$, $\det A = a \ne 0$, then
the distribution density of $Y$ is $g_Y(y) = f(A^{-1}(y - b))/|a|$.

Some statistical problems deal with the ratio $\zeta = \xi/\eta$ of two
independent random variables whose distribution densities $f_\xi$ and
$f_\eta$ are known. The distribution density of $\zeta$ can be found from
the formula

$$f_\zeta(z) = \int_{-\infty}^{\infty} f_\xi(zx) f_\eta(x) |x|\,dx. \qquad (1.4)$$
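
As a worked illustration of formula (1.4) (an example added here, not taken from the source), suppose that the two independent variables in the ratio both have the standard normal density. Writing the density of the ratio as f_ζ(z), the integral can be evaluated in closed form:

```latex
\begin{aligned}
f_\zeta(z)
  &= \int_{-\infty}^{\infty}
     \frac{1}{\sqrt{2\pi}} e^{-z^2 x^2/2}\,
     \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\, |x| \, dx \\
  &= \frac{1}{\pi} \int_{0}^{\infty} x\, e^{-x^2 (1 + z^2)/2} \, dx
   = \frac{1}{\pi (1 + z^2)}, \qquad -\infty < z < \infty .
\end{aligned}
```

The result is the density of Cauchy's distribution with a = 0, one of the standard distributions listed in Sec. 1.6.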

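1.6. We will need some frequently applied distributions $\mathcal{L}(\xi)$
and their properties.

(1) The normal distribution $\mathcal{N}(\mu, \sigma^2)$,
$-\infty < \mu < \infty$, $\sigma^2 > 0$, has the density
$\dfrac{1}{\sqrt{2\pi}\,\sigma} e^{-(x - \mu)^2/(2\sigma^2)}$,
$-\infty < x < \infty$. Here $\mu = E\xi$, $\sigma^2 = D\xi$, and the central
moments $\mu_k = E(\xi - \mu)^k$ are $\mu_{2r-1} = 0$,
$\mu_{2r} = \dfrac{(2r)!}{r!\,2^r}\sigma^{2r} = 1 \times 3 \times \ldots \times (2r - 1)\sigma^{2r}$,
respectively. The distribution $\mathcal{N}(0, 1)$ is called a standard
normal distribution; its distribution function is
$\Phi(x) = \dfrac{1}{\sqrt{2\pi}} \displaystyle\int_{-\infty}^{x} e^{-t^2/2}\,dt$,
the equation $\Phi(u_p) = p$, $p \in (0, 1)$, uniquely defines its
$p$-quantile $u_p$, with $u_{1-p} = -u_p$. The notation
$c_\gamma = u_{(1+\gamma)/2}$ can also be found in the literature. The random
vector $\xi = (\xi_1, \ldots, \xi_k)$ has a $k$-variate normal distribution
$\mathcal{N}(\mu = (\mu_1, \ldots, \mu_k), \Sigma = \|\sigma_{ij}\|_1^k)$ if its
characteristic function is of the form*

$$E e^{it'\xi} = \exp\left\{ i\mu' t - \frac{1}{2} t' \Sigma t \right\}, \quad t = (t_1, \ldots, t_k).$$

Here

$$E(\xi) = (E\xi_1, \ldots, E\xi_k) = \mu,$$

$$D(\xi) = E(\xi - \mu)(\xi - \mu)' = \|\operatorname{cov}(\xi_i, \xi_j)\|_1^k = \|\sigma_{ij}\|_1^k = \Sigma.$$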



* In matrix operations vectors are treated as column-vectors and ' stands 
for a transposition. 
If $|\Sigma| \ne 0$, the distribution $\mathcal{N}(\mu, \Sigma)$ is
non-degenerate and has the density

$$(2\pi)^{-k/2} |\Sigma|^{-1/2} \exp\left\{ -\frac{1}{2}(x - \mu)' \Sigma^{-1} (x - \mu) \right\}, \quad x = (x_1, \ldots, x_k) \in R^k.$$

The normal distribution has an important property that under a
linear transformation $\eta = A\xi$ ($A$ is a given matrix) we obtain a
normal random vector and
$\mathcal{L}(\eta) = \mathcal{N}(A\mu, A\Sigma A')$. Specifically, if
$\eta = U'\xi$, where $U$ is an orthogonal matrix which reduces $\Sigma$ to a
diagonal form $U'\Sigma U = D = \operatorname{diag}(\lambda_1, \ldots, \lambda_k)$
($\lambda_j$, $j = 1, \ldots, k$, are the eigenvalues of $\Sigma$), then
$\mathcal{L}(\eta) = \mathcal{N}(U'\mu, D)$, i.e., the components of the
vector $\eta$ are non-correlated and therefore independent. Putting
$Z = D^{-1/2} U'(\xi - \mu)$ (if all the $\lambda_j > 0$), we get
$\mathcal{L}(Z) = \mathcal{N}(0, E_k)$, where $E_k$ is an identity matrix of
dimension $k$. Thus, we can always find a linear transformation to turn a
non-degenerate normal vector into a vector with independent standard
normal components.

When applying samples from normal distributions, we need the
following important assertions [7, pp. 38-40].

1°. If $X = (X_1, \ldots, X_n)$ is a sample from the distribution
$\mathcal{N}(\mu, \sigma^2)$ and $t = BX$, $Q_i = X'A_iX$, $i = 1, 2$, are,
respectively, linear and quadratic functions of $X$, then it is
sufficient that $BA_1 = 0$ for $t$ and $Q_1$ to be independent, and
$A_1A_2 = A_2A_1 = 0$ for $Q_1$ and $Q_2$ to be independent.

2°. Let $\mu = 0$, $\sigma^2 = 1$, and $A_1^2 = A_1$ (the matrix $A_1$ is
idempotent). Then $\mathcal{L}(Q_1) = \chi^2(r)$, where
$r = \operatorname{rank} A_1 = \operatorname{tr} A_1$ is the trace of the
matrix $A_1$.

3°. Fisher's theorem. The sample mean $\bar X$ and variance $S^2$ are
independent and
$\mathcal{L}(\sqrt{n}(\bar X - \mu)/\sigma) = \mathcal{N}(0, 1)$,
$\mathcal{L}(nS^2/\sigma^2) = \chi^2(n - 1)$.
(The definition of the $\chi^2$-distribution will be given below.)

(2) The gamma distribution $\Gamma(a, \lambda)$, $a, \lambda > 0$, is defined
by the density
$\dfrac{x^{\lambda - 1} e^{-x/a}}{a^\lambda \Gamma(\lambda)}$, $x > 0$ (here
$\Gamma(\lambda) = \displaystyle\int_0^\infty t^{\lambda - 1} e^{-t}\,dt$,
$\lambda > 0$, is a gamma function), and its moments are
$E\xi^k = a^k \Gamma(\lambda + k)/\Gamma(\lambda)$, $k > -\lambda$. In
particular, $E\xi = a\lambda$, $D\xi = a^2\lambda$.

The special case $\Gamma(a, 1)$ is called an exponential distribution.
Another special case is $\Gamma(2, n/2)$. It is called the chi-square
distribution with $n$ degrees of freedom and is denoted $\chi^2(n)$. Here
$\chi^2(n) = \mathcal{L}(\xi_1^2 + \ldots + \xi_n^2)$, the terms are
independent, and $\mathcal{L}(\xi_i) = \mathcal{N}(0, 1)$,
$i = 1, \ldots, n$. We use $\chi^2_{p,n}$ to denote the $p$-quantiles of the
distribution $\chi^2(n)$.

(3) In the general case, Weibull's distribution $W(a, \alpha, b)$ depends
on three parameters, i.e., the location (position) parameter
$a \in R^1$, the shape parameter $\alpha > 0$, and the scale parameter
$b > 0$, and is defined by the distribution function

$$F(x) = 1 - \exp\left[ -\left( \frac{x - a}{b} \right)^{\alpha} \right], \quad x \ge a.$$

Here

$$E\xi = a + b\,\Gamma\left(1 + \frac{1}{\alpha}\right), \quad D\xi = b^2 \left[ \Gamma\left(1 + \frac{2}{\alpha}\right) - \Gamma^2\left(1 + \frac{1}{\alpha}\right) \right].$$

The special case $W(a, 1, b)$ is known as a two-parameter exponential
distribution, and the case $W(a, 2, b)$ is known as Rayleigh's
distribution.

(4) The beta distribution $B(a, b)$, $a, b > 0$, is defined by the density
$x^{a-1}(1 - x)^{b-1}/\mathrm{B}(a, b)$, $0 \le x \le 1$, where
$\mathrm{B}(a, b) = \dfrac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)}$ is a beta
function. Here

$$E\xi = \frac{a}{a + b}, \quad D\xi = \frac{ab}{(a + b)^2 (a + b + 1)}.$$

(5) The uniform distribution $R(a, b)$, $-\infty < a < b < \infty$, has a
constant density $f(x) = \dfrac{1}{b - a}$, $a \le x \le b$. Here

$$E\xi = \frac{a + b}{2}, \quad D\xi = \frac{(b - a)^2}{12}.$$

(6) Cauchy's distribution $C(a)$, $-\infty < a < \infty$, is defined by the
density $\dfrac{1}{\pi[1 + (x - a)^2]}$, $-\infty < x < \infty$. This
distribution has no moments (including the mathematical expectation), and
the constant $a$ coincides with the median $\zeta_{1/2}$. Cauchy's
distribution has an important property that if the random variables
$\xi_1, \ldots, \xi_n$ are independent and $\mathcal{L}(\xi_i) = C(a_i)$,
$i = 1, \ldots, n$, then $\mathcal{L}(\bar\xi) = C(\bar a)$, where the bar
denotes an arithmetic mean.

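(7) Student's distribution
$S(n) = \mathcal{L}\big(\eta/\sqrt{\chi_n^2/n}\big)$ with $n$ degrees of
freedom, where $\eta$ and $\chi_n^2$ are independent random variables and
$\mathcal{L}(\eta) = \mathcal{N}(0, 1)$, $\mathcal{L}(\chi_n^2) = \chi^2(n)$,
has a density of the form

$$\frac{\Gamma\left(\frac{n+1}{2}\right)}{\sqrt{\pi n}\,\Gamma\left(\frac{n}{2}\right)} \left(1 + \frac{x^2}{n}\right)^{-(n+1)/2}, \quad -\infty < x < \infty.$$

We use $t_{p,n}$ to denote its $p$-quantiles.

(8) The distribution of the ratio
$\dfrac{\chi_{n_1}^2/n_1}{\chi_{n_2}^2/n_2}$ (the F-distribution)
with $n_1$ and $n_2$ degrees of freedom, where $\chi_{n_1}^2$ and
$\chi_{n_2}^2$ are independent random variables, and
$\mathcal{L}(\chi_{n_i}^2) = \chi^2(n_i)$, $i = 1, 2$, has a density of the
form

$$\left(\frac{n_1}{n_2}\right)^{n_1/2} \frac{\Gamma\left(\frac{n_1 + n_2}{2}\right)}{\Gamma\left(\frac{n_1}{2}\right)\Gamma\left(\frac{n_2}{2}\right)}\, x^{n_1/2 - 1} \left(1 + \frac{n_1}{n_2}x\right)^{-(n_1 + n_2)/2}, \quad x > 0.$$

We use $F_{p,n_1,n_2}$ to denote its $p$-quantile, where
$F_{1-p,n_2,n_1} = 1/F_{p,n_1,n_2}$.
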

(9) The binomial distribution $Bi(n, p)$ is a distribution of the number
of successes in $n$ independent trials with two outcomes
(success-failure) and a constant probability of success $p \in (0, 1)$ (the
Bernoulli trials). Here

$$P(\xi = k) = C_n^k p^k q^{n-k}, \quad k = 0, 1, \ldots, n, \quad q = 1 - p,$$

$$E\xi = np, \quad D\xi = npq.$$

For $n = 1$ we have Bernoulli's distribution $Bi(1, p)$.

(10) The polynomial distribution $M(n; p_1, \ldots, p_N)$,
$p_1 + \ldots + p_N = 1$, is a distribution of a random vector
$\nu = (\nu_1, \ldots, \nu_N)$ with non-negative integer-valued components
satisfying the condition $\nu_1 + \ldots + \nu_N = n$ which has the form

$$P(\nu = h) = \frac{n!}{h_1! \ldots h_N!}\, p_1^{h_1} \ldots p_N^{h_N}, \quad h = (h_1, \ldots, h_N), \quad h_1 + \ldots + h_N = n.$$

Here

$$E\nu_i = np_i, \quad D\nu_i = np_i(1 - p_i), \quad \operatorname{cov}(\nu_i, \nu_j) = -np_ip_j, \quad i \ne j.$$

If we carry out $n$ independent trials with $N$ possible outcomes whose
probabilities do not vary and are equal to $p_1, \ldots, p_N$, respectively,
then, by using $\nu_i$ to denote the number of the realizations of the
$i$th outcome, $i = 1, \ldots, N$, we will obtain
$\mathcal{L}(\nu) = M(n; p_1, \ldots, p_N)$. If $N = 2$, we have
$M(n; p, 1 - p) = Bi(n, p)$, i.e., the polynomial distribution is reduced
to the binomial one.

(11) Poisson's distribution $\Pi(\lambda)$, $\lambda > 0$, is defined by the
probabilities

$$P(\xi = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \ldots.$$

Here $\lambda = E\xi = D\xi$ and, in general, $E(\xi)_j = \lambda^j$, where
$(a)_j = a(a - 1) \ldots (a - j + 1)$, $j \ge 1$, $(a)_0 = 1$.

(12) The negative binomial distribution $\overline{Bi}(r, p)$,
$p \in (0, 1)$, $r = 1, 2, \ldots$, is defined by the probabilities

$$P(\xi = k) = C_{r+k-1}^{k} p^k q^r, \quad k = 0, 1, 2, \ldots, \quad q = 1 - p.$$

This is a distribution of the number of successes before the $r$th failure
in an infinite sequence of Bernoulli trials. Here

$$E\xi = rp/q, \quad D\xi = rp/q^2.$$

In the special case of $r = 1$, the $\overline{Bi}(1, p)$ distribution is
called a geometric distribution.

(13) The hypergeometric distribution $H(r, N, n)$ is defined by the
probabilities

$$P(\xi = k) = C_r^k C_{N-r}^{n-k} / C_N^n,$$

$$\max(0, n + r - N) \le k \le \min(n, r).$$

If an urn contains $N$ balls, $r$ of which are red and $N - r$ are black,
and we withdraw from it without replacement a random sample of
size $n$, then the random variable $\xi$ (the number of red balls in the
sample) has a hypergeometric distribution. Here

$$E\xi = \frac{nr}{N}, \quad D\xi = \frac{nr}{N}\left(1 - \frac{r}{N}\right)\frac{N - n}{N - 1}$$

and, in general, $E(\xi)_j = \dfrac{(n)_j (r)_j}{(N)_j}$.

Other properties of these distributions are considered in Problems
1.39-55.

If a statistical model $\mathcal{F} = \{F_\xi\}$ is defined by a standard
distribution with unknown parameters $\theta$ (if there are a few
parameters, only some of them may be unknown), the model preserves the
name of the distribution. For example, the model
$\mathcal{N}(\theta, \sigma^2)$ is said to be normal with an unknown mean,
the model $\mathcal{N}(\mu, \theta^2)$ is normal with only variance as the
unknown parameter, the model $\mathcal{N}(\theta_1, \theta_2^2)$ is a general
normal model with two unknown parameters, the model $\Pi(\theta)$ is
Poisson's model.

1.7. Statistical simulation using a sequence of pseudo-random numbers
helps to illustrate the efficiency of various statistical procedures.
Pseudo-random numbers are sequences of numbers obtained by a certain
algorithm and having the properties of a sequence of random
numbers. The methods of obtaining pseudo-random numbers can be
found in [4, 9].

A realization of a sequence of arbitrarily distributed independent
random numbers is commonly obtained from a realization of a sequence
of independent random numbers uniformly distributed on a
segment [0, 1].
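
One common way to pass from uniform numbers to another distribution is to invert the distribution function. The sketch below applies this to Cauchy's distribution C(a) from Sec. 1.6, whose distribution function is 1/2 + arctan(x - a)/π. The choice of this technique and of this distribution is an illustration added here; the function name, the seed, and the value a = 3 are assumptions, and Python's built-in generator stands in for whatever uniform source is actually used.

```python
import math
import random

def cauchy_from_uniform(u, a):
    """Invert F(x) = 1/2 + arctan(x - a)/pi, the C(a) distribution function."""
    return a + math.tan(math.pi * (u - 0.5))

rng = random.Random(2024)     # fixed seed, an arbitrary choice
a = 3.0                       # hypothetical location parameter
sample = sorted(cauchy_from_uniform(rng.random(), a) for _ in range(10_001))
median = sample[len(sample) // 2]   # sample median, should be close to a
print(median)
```

Since C(a) has no moments, the sample median rather than the sample mean is the natural check that the simulated values are centred at a.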

A realization of the uniformly distributed random numbers

$$U_0, U_1, U_2, \ldots \qquad (1.5)$$

is frequently obtained by the linear congruential method [9]

$$U_n = z_n/m, \qquad (1.6)$$

where $z_n$ is a sequence defined by the recurrence relation

$$z_{n+1} = az_n + c \pmod{m},$$

where $z_0$ is the initial value, and $a$, $c$, and $m$ are positive integers.
Strictly speaking, the sequence (1.5) defined by (1.6) cannot be treated
as a realization of an independent sequence of uniformly distributed
numbers, because it is either periodic or periodic with a lead
sequence. The length of the period $T$ does not exceed $m$, because the
number of different values of $z_n$, $n = 0, 1, 2, \ldots$, does not exceed
$m$.

It is obvious that one should not use more terms of the sequence than the
length of the lead sequence plus the length of the period. Nevertheless,
the sequence (1.5) can have the greatest possible period $m$ when the
constants $a$, $c$, $m$, and $z_0$ are chosen properly.

The following theorem defines the conditions for the period of a
sequence to be maximal [9].

Theorem. The length of the period of a linear congruential sequence
(1.5) is equal to $m$ if and only if

(1) $c$ and $m$ are relatively prime numbers;

(2) $b = a - 1$ is divisible by $p$ for any prime $p$ which is a divisor
of $m$;

(3) $b$ is divisible by 4 if $m$ is divisible by 4.
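
The recurrence and the three conditions of the theorem can be checked numerically on small moduli. The following sketch is illustrative only: the function names and the constants used in the usage lines are assumptions, and the period is found by brute force, which is feasible only for small m.

```python
from math import gcd

def lcg(z0, a, c, m):
    """Yield z_0, z_1, ... with z_{n+1} = (a * z_n + c) mod m."""
    z = z0
    while True:
        yield z
        z = (a * z + c) % m

def prime_divisors(m):
    """Return the set of prime divisors of m (trial division)."""
    primes, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            primes.add(p)
            m //= p
        p += 1
    if m > 1:
        primes.add(m)
    return primes

def has_full_period(a, c, m):
    """Check conditions (1)-(3) of the theorem for the constants a, c, m."""
    b = a - 1
    return (gcd(c, m) == 1
            and all(b % p == 0 for p in prime_divisors(m))
            and (m % 4 != 0 or b % 4 == 0))

def period_length(z0, a, c, m):
    """Count steps until a value of z_n repeats; the lead sequence is excluded."""
    seen = {}
    for n, z in enumerate(lcg(z0, a, c, m)):
        if z in seen:
            return n - seen[z]
        seen[z] = n

# Small hypothetical constants: the first set satisfies the theorem, the second does not
print(has_full_period(5, 3, 16), period_length(7, 5, 3, 16))
print(has_full_period(3, 3, 16), period_length(0, 3, 3, 16))
```

Dividing each z_n by m gives the numbers U_n of (1.6); the brute-force period count serves only as an independent check of the conditions.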

The presence of a complete period does not always ensure good
properties of pseudo-random numbers. Even the commonly used
generators have essential drawbacks. Various statistical tests [6] help
to verify the "quality" of the sequences generated. It is usually enough
to check whether the $s$-chains ($s = 1, 2, \ldots$) of the sequence (1.5)
are uniformly distributed, and then use this sequence to solve
simulation problems.

Let us simulate $n = 100$ uniformly distributed numbers $X_1, X_2, \ldots,
X_n$ and list the results in Table 1.1.

Table 1.1

0.168  0.273  0.878  0.983  0.588  0.693  0.298  0.403  0.008  0.113
0.718  0.823  0.428  0.533  0.138  0.243  0.848  0.953  0.558  0.663
0.549  0.754  0.459  0.664  0.369  0.574  0.279  0.484  0.189  0.394
0.099  0.304  0.009  0.214  0.919  0.124  0.829  0.034  0.739  0.944
0.550  0.855  0.660  0.965  0.770  0.075  0.880  0.185  0.990  0.295
0.100  0.405  0.210  0.515  0.320  0.625  0.430  0.735  0.540  0.845
0.571  0.976  0.881  0.286  0.191  0.596  0.501  0.906  0.311  0.216
0.121  0.526  0.431  0.836  0.741  0.146  0.051  0.456  0.361  0.766
0.012  0.517  0.522  0.027  0.032  0.537  0.542  0.047  0.052  0.557
0.562  0.067  0.072  0.577  0.562  0.087  0.092  0.597  0.602  0.107



Figure 1 shows the empirical distribution function $F_n(x)$ constructed
from these data.

We now obtain $n$ normally distributed random numbers $X_1,
X_2, \ldots, X_n$ with the parameters $\mu = EX_i$, $\sigma^2 = DX_i$. The
histograms for $\mu = 1$, $\sigma^2 = 4$, and $n = 10, 100, 1000$ are plotted
in Figs. 2-4.

We divide the $x$-axis into intervals of length $h$, where $h = 3, 1.5,
0.75$ for $n = 10, 100, 1000$, respectively. The boundary point of the
intervals is $x = 1$ for any $n$.
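
The grouping into intervals of length h with a boundary point at x = 1 can be sketched as follows. This is an illustration under stated assumptions: Python's `random.gauss` stands in for the normal generator (the text does not say which one was used), the seed is arbitrary, and the helper name `histogram_counts` is hypothetical.

```python
import math
import random
from collections import Counter

def histogram_counts(xs, h, boundary=1.0):
    """Count observations in intervals [boundary + j*h, boundary + (j+1)*h)."""
    return Counter(math.floor((x - boundary) / h) for x in xs)

rng = random.Random(7)                        # arbitrary fixed seed
n, h = 100, 1.5                               # n and h as paired in the text
xs = [rng.gauss(1.0, 2.0) for _ in range(n)]  # mu = 1, sigma = 2 (sigma^2 = 4)
counts = histogram_counts(xs, h)
for j in sorted(counts):
    left = 1.0 + j * h
    # counts[j] / (n * h) is the histogram height over [left, left + h)
    print(f"[{left:6.2f}, {left + h:6.2f})  {counts[j] / (n * h):.3f}")
```

Dividing each count by n * h makes the total area of the histogram equal to one, so that it can be compared with the normal density.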
Table 1.2 gives the estimates

$$\bar X = \frac{1}{n} \sum_{i=1}^{n} X_i, \qquad S'^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar X)^2$$

for the parameters $\mu = 1$ and $\sigma^2 = 4$.

Table 1.2

$n$         10       100      1000
$\bar X$    0.676    1.016    0.988
$S'^2$      3.901    4.315    4.306




Problems 

1.1. Suggest a method to simulate a sequence of Bernoulli trials $X_1,
X_2, \ldots, X_n, \ldots$, where $P(X_i = 1) = 1 - P(X_i = 0) = p$.

Hint. Use a sequence of pseudo-random numbers uniformly distributed
on the segment [0, 1].

1.2. Simulate a sequence of Bernoulli trials as in Problem 1.1, where
$p = 0.4$ and $n = 1000$. Calculate the frequencies $\mu_k/k$, where
$\mu_k = X_1 + \ldots + X_k$, for $k = 100, 200, \ldots, 900, 1000$. Construct a
graph in the $xy$-plane by connecting the neighbouring points
$(k, \mu_k/k)$, $k = 100, 200, \ldots, 1000$, by straight lines.

1.3. Find a way to simulate independent trials in a polynomial
scheme with the outcomes $1, 2, \ldots, N$ whose probabilities are $p_1,
p_2, \ldots, p_N$, respectively.

1.4. Find a way to simulate a discrete-time symmetric wandering
through integer points on a straight line with origin at the point 0
(the probabilities of transitions to neighbouring points in a single step
are taken to be the same).

1.5. Let a random variable $\xi$ be uniformly distributed on the interval
[0, 1], and let $F(x)$ be a continuous distribution function. Find the
distribution function of the random variable $\eta = F^{-1}(\xi)$, where
$x = F^{-1}(y)$ is a function inverse to $y = F(x)$.

1.6. Suggest a simulation technique for a random sequence $X_1,
X_2, \ldots, X_n$, where $P(X_n \le t) = 1 - e^{-t/a}$, $t > 0$ ($a > 0$ is a
constant).

Hint. Use the previous problem.

1.7. Simulate independent and exponentially distributed quantities
$X_1, X_2, \ldots, X_n$ with $a = 1$ and $n = 100$. Construct an empirical
distribution function and a histogram. Calculate the first and second
sampling moments $A_{n1}$ and $A_{n2}$.

Hint. Use the previous problem.

1.8. Suggest a simulation technique for an Erlang random sequence
$\{X_j\}$ with the parameters $(a, m)$ (i.e.,
$\mathcal{L}(X_j) = \Gamma(a, m)$, $j = 1, 2, \ldots$).

1.9. Using the Central Limit Theorem, find a way to simulate
approximately normally distributed random numbers $X_n$,
$n = 1, 2, \ldots$.

1.10. Let $X_1, \ldots, X_n$ be the realization of a sequence of
approximately normally distributed numbers each of which is obtained by
summing up $N$ uniformly distributed terms (see the previous problem).
Get three realizations (for $N = 2, 4, 12$) of the samples with $n = 100$,
$a = 0$, and $\sigma^2 = 1$. Construct the empirical distribution functions
and histograms for each sample. Find the estimates for $a$ and $\sigma^2$.

1.11. Using the samples from the previous problem, calculate the 
third and fourth central sampling moments and compare them with 
the true values of the theoretical moments. 

1.12. Suggest a simulation technique for a sample from a binomial
distribution $Bi(k, p)$.

1.13. Let $\nu_n$ be the number of successes in $n$ Bernoulli trials with
the probability of success $p \in (0, 1)$. Under the condition that $n$ is
large calculate the boundary $\delta_\gamma$ such that the event
$\left| \dfrac{\nu_n}{n} - p \right| < \delta_\gamma$ has the probability
$\approx \gamma$. Check whether the results of the following
(De Buffon's) experiment lie within these boundaries for $\gamma = 0.98$,
viz., heads appeared $h = 2048$ times at $n = 4040$ tossings of a coin.

Hint. Apply the De Moivre-Laplace theorem and consider the coin to be
symmetric.

1.14. Using the approach of the previous problem, check whether
the following data correspond to the theory, viz., among the
$n = 10\,000$ randomly placed numbers $0, 1, \ldots, 9$, those not exceeding
4 appeared $h = 5089$ times.

1.15. Simulate a sample of size $n = 1000$ from Bernoulli's distribution
$Bi(1, 3/5)$ and check, as in Problem 1.13, whether the experimental
data correspond to the theoretical prediction.

Hint. Use Problem 1.1.

1.16. Suppose that an experiment consists of tossing 12 dice. The
observable random variable $\xi$ is equal to the number of dice with a
4, a 5, or a 6. Let $h_i$ be the number of trials in which the values
$\xi = i$, $i = 0, 1, \ldots, 12$, were observed. The data for $n = 4096$
trials are given [8] in the following table:

$i$      0   1   2    3     4     5     6     7     8     9     10   11   12   Total
$h_i$    0   7   60   198   430   731   948   847   536   257   71   11   0    $n = 4096$

(a) Construct the frequency graph $h_i/n$ and compare it with the
graph of the function c*e~ lVI . 

(b) Compute the sample mean and variance, the skewness coefficient,
and kurtosis.

(c) Assume that $\mathcal{L}(\xi) = Bi(12, 1/2)$ and find $\delta$ from the
condition $P(|\bar X - \alpha_1| \le \delta) = 0.998$. Compare $\delta$ with
the deviation of the sample mean from the theoretical $\alpha_1$ as
calculated from the given data.

Hint. When estimating the probability in (c), use the theorem on
the asymptotic normality of a sample mean.

1.17. (Continued from Problem 1.16.) Let the random variable $\xi$ of
the previous experiment be equal to the number of dice with a 6. The
observed data are tabulated [8] as

$i$      0     1      2      3     4     5     6    $\ge 7$   Total
$h_i$    447   1145   1181   796   380   115   24   8         $n = 4096$

Answer the questions of Problem 1.16 if $\mathcal{L}(\xi) = Bi(12, 1/6)$.

1.18. Simulate a sample of size $n = 1000$ from the distribution
$\mathcal{L}(\xi) = Bi(4, 1/3)$ and analyze the obtained data as in
Problem 1.16.

Hint. Use Problem 1.12.

1.19. Suppose that we observe 500 randomly chosen watches in shop
windows. Let $i$ be the number of the interval between the $i$th and
$(i + 1)$th hours, $i = 0, 1, \ldots, 11$, and let $h_i$ be the number of
watches indicating the $i$th interval. The observation results are
grouped [3] in the following table:

  i      0   1   2   3   4   5   6   7   8   9  10  11   Total
  h_i   41  34  54  39  49  45  41  33  37  41  47  39   n = 500

(a) Construct a frequency polygon and compare it with the plot of
the function $f(x) = c$, $0 \le x \le 12$.

(b) Assume that these data are independent observations on a discrete
random variable $\xi$ whose values coincide with the middle points
of the respective intervals (i.e., 0.5, 1.5, ..., 11.5) and calculate the
sample mean and variance.

(c) Assuming that the random variable $\xi$ from (b) has a uniform
distribution, find $\delta$ from the condition
$P(|\bar{X} - \alpha_1| \le \delta) = 0.98$ and compare it with the
observed deviation $|\bar{X} - \alpha_1|$.

1.20. Simulate a sample from the polynomial distribution
$M(500; 1/5, 1/5, 1/5, 1/5, 1/5)$ and, assuming that these data are
observations on a random variable $\xi$ having the values -2, -1, 0, 1, 2,
analyze the respective data as it was done in Problem 1.19.

Hint. Use Problem 1.3.

1.21. Suppose that we observed a non-negative continuous random
variable $\xi$. Its values (rounded to 0.01 and placed in the order of
magnitude) for n = 50 trials were 0.01, 0.01, 0.04, 0.17, 0.18, 0.22, 0.22,
0.25, 0.25, 0.29, 0.42, 0.46, 0.47, 0.47, 0.56, 0.59, 0.67, 0.68, 0.70, 0.72,
0.76, 0.78, 0.83, 0.85, 0.87, 0.93, 1.00, 1.01, 1.01, 1.02, 1.03, 1.05, 1.32,
1.34, 1.37, 1.47, 1.50, 1.52, 1.54, 1.59, 1.71, 1.90, 2.10, 2.35, 2.46, 2.46,
2.50, 3.73, 4.07, 6.03. Construct an empirical distribution function and
a histogram. Compare the histogram with the graph of the function 
cc'"". x > 0. Compute the sample mean and variance. 

1.22. Suppose that a sample of size n = 100 was 0.144, 0.937, 1.787,
-1.052, -0.192, 0.169, 2.623, 2.135, 1.759, 0.811, 0.724, -0.110, 1.752,
-0.378, 0.417, 1.360, 1.365, 2.587, 1.621, 2.344, 1.379, 0.560, 1.858,
2.453, -0.356, 1.503, -0.134, 2.950, -0.816, 0.717, 2.468, 1.131, 1.047,
1.355, 1.162, -0.491, 0.261, -0.183, 0.467, 0.502, -0.805, 0.228,
2.286, 0.364, -0.312, -0.045, 2.559, 0.129, 0.898, 0.877, 3.285, 1.554,
1.418, 0.423, -0.489, -0.255, 1.092, 0.402, -0.051, 0.020, 0.398,
1.399, 2.121, -0.026, 1.087, 2.018, -0.437, 1.661, 1.091, 0.363, 1.229,
0.416, 1.705, 1.124, 1.341, 2.320, 0.176, -0.541, 0.837, 3.329, 2.382,
-0.454, 2.537, -0.299, 1.363, 0.644, 0.975, 1.294, 3.194, 0.605, 1.978,
1.109, 2.434, -0.094, 0.735, 0.143, -0.421, -0.773, 1.570, 0.947.
Construct an empirical distribution function and a histogram. Calculate
the sample mean and variance, the skewness coefficient, and kurtosis.

1.23. Let $\alpha$-particles be radiated by a radioactive substance during
7.5 seconds. Suppose that the following data were obtained in
n = 2608 experiments ($h_i$ is the number of trials for which the number
of particles $\xi = i$, $i = 0, 1, \ldots$), i.e.,

  i      0    1    2    3    4    5    6    7   8   9  10  11  >=12   Total
  h_i   57  203  383  525  532  408  273  139  45  27  10   4     2   n = 2608

Construct the frequency graph $h_i/n$ and calculate the sample mean
and variance [5].

In the problems below $X = (X_1, \ldots, X_n)$ is a sample from a
distribution $\mathcal{L}(\xi)$, and $F(x)$ and $F_n(x)$ are the theoretical
and empirical distribution functions, respectively (see (1.1)).

1.24. Given a point $x_0$, such that $0 < F(x_0) < 1$, and a number $t$,
estimate the probability of the event

$$|F_n(x_0) - F(x_0)| \le t/\sqrt{n}$$

for large $n$.

Hint. Use the De Moivre-Laplace theorem.

1.25. Let $x_1 < x_2$ be two given points on a number line, such that
$0 < F(x_1) \le F(x_2) < 1$. Prove that

$$\mathrm{cov}(F_n(x_1), F_n(x_2)) = \frac{1}{n} F(x_1)(1 - F(x_2)).$$

Hint. Represent the random variables $\mu_n(x_1)$ and
$\Delta_n(x_1, x_2) = \mu_n(x_2) - \mu_n(x_1)$ as the sums of independent
indicators

$$\mu_n(x_1) = \eta_1 + \ldots + \eta_n,$$

where $\eta_i = I(X_i \le x_1)$, $i = 1, \ldots, n$, and

$$\Delta_n(x_1, x_2) = \zeta_1 + \ldots + \zeta_n,$$

where $\zeta_i = I(x_1 < X_i \le x_2)$, $i = 1, \ldots, n$.

1.26. Let $x_1 < x_2 < \ldots < x_{N-1}$ be given points on a number line,
such that $0 < F(x_1) < F(x_2) < \ldots < F(x_{N-1}) < 1$. Examine the
random variables $\nu_i = \mu_n(x_i) - \mu_n(x_{i-1})$, $i = 1, \ldots, N$
(here $\mu_n(x_0) = 0$, $\mu_n(x_N) = n$), and make sure that the random
vector $\nu = (\nu_1, \ldots, \nu_N)$ has a polynomial distribution
$M(n; p_1, \ldots, p_N)$, where $p_i = F(x_i) - F(x_{i-1})$,
$i = 1, \ldots, N$, $F(x_0) = 0$, $F(x_N) = 1$. Derive the result
obtained in Problem 1.25.

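1.27. Derive the formulas

$$E A_{nk} = E \xi^k = \alpha_k, \qquad \mathrm{cov}(A_{nk}, A_{nl}) = \frac{\alpha_{k+l} - \alpha_k \alpha_l}{n},$$

$$E S^2 = \frac{n - 1}{n}\, \mu_2, \qquad D S^2 = \frac{(n - 1)^2}{n^3} \left( \mu_4 - \frac{n - 3}{n - 1}\, \mu_2^2 \right),$$

$$\mu_k = E(\xi - \alpha_1)^k, \qquad \mathrm{cov}(\bar{X}, S^2) = \frac{n - 1}{n^2}\, \mu_3$$
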
for the moments of sampling moments. Calculate the moments for

1.28. Prove that for any fixed $r \ge 2$, $1 \le k_1 < \ldots < k_r$, the
joint distribution of the sampling moments $A_{nk_1}, \ldots, A_{nk_r}$ as
$n \to \infty$ is asymptotically normal as $\mathcal{N}(\alpha, \Sigma/n)$,
where $\alpha = (\alpha_{k_1}, \ldots, \alpha_{k_r})$,
$\Sigma = \|\sigma_{ij} = \alpha_{k_i + k_j} - \alpha_{k_i}\alpha_{k_j}\|$, i.e.,
$\mathcal{L}(\sqrt{n}(A_{nk_j} - \alpha_{k_j}),\ j = 1, \ldots, r) \to \mathcal{N}(0, \Sigma)$
(it is assumed that all the theoretical moments exist). Besides,
if $\varphi(x)$, $x = (x_1, \ldots, x_r)$, is any differentiable function, then
$\mathcal{L}(\sqrt{n}(\varphi(A_{nk_1}, \ldots, A_{nk_r}) - \varphi(\alpha))) \to \mathcal{N}(0, \nu^2)$
under the condition that $\nu^2 \ne 0$, where

$$\nu^2 = \sum_{i,j=1}^{r} \frac{\partial \varphi}{\partial x_i} \frac{\partial \varphi}{\partial x_j}\, \sigma_{ij} \Big|_{x = \alpha}.$$

Hint. Apply the Central Limit Theorem for vector random variables
and the assertion 4° from Sec. 1.4.
1.29*. Prove that for $n \to \infty$ the sample variance $S^2$ is
asymptotically normal as $\mathcal{N}(\mu_2, (\mu_4 - \mu_2^2)/n)$ and then
$E S^2 \sim \mu_2$, $D S^2 \sim (\mu_4 - \mu_2^2)/n$ (we assume that
$\mu_4 < \infty$).

Hint. Use the assertion 2° (a) from Sec. 1.4.

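1.30. Prove that the joint distribution function of two order statistics
$X_{(r)}$ and $X_{(s)}$, $1 \le r < s \le n$, has the form

$$F_{rs}(x_1, x_2) = \sum_{m=r}^{n} \ \sum_{j=\max(0,\, s-m)}^{n-m} \frac{n!}{m!\, j!\, (n - m - j)!}\, F^m(x_1)\, (F(x_2) - F(x_1))^j\, (1 - F(x_2))^{n-m-j}$$

if $x_1 < x_2$, and the form

$$F_{rs}(x_1, x_2) = P(X_{(s)} \le x_2) = F_s(x_2) = \sum_{m=s}^{n} C_n^m F^m(x_2)(1 - F(x_2))^{n-m}$$

if $x_1 \ge x_2$. Using this, derive the formula for $F_r(x) = P(X_{(r)} \le x)$.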

1.31. Let the distribution $\mathcal{L}(\xi)$ be absolutely continuous and its
density be $F'(x) = f(x)$. Derive the formula

$$g_{k_1 \ldots k_r}(x_1, \ldots, x_r) = \frac{n!}{(k_1 - 1)!\,(k_2 - k_1 - 1)! \ldots (k_r - k_{r-1} - 1)!\,(n - k_r)!}$$
$$\times\ F^{k_1 - 1}(x_1)\,(F(x_2) - F(x_1))^{k_2 - k_1 - 1} \ldots (F(x_r) - F(x_{r-1}))^{k_r - k_{r-1} - 1}\,(1 - F(x_r))^{n - k_r}\, f(x_1) \ldots f(x_r),$$
$$x_1 < x_2 < \ldots < x_r,$$

for the density of the joint distribution of the order statistics
$X_{(k_1)}, \ldots, X_{(k_r)}$, $1 \le k_1 < \ldots < k_r \le n$. Specifically, the
joint density of all the $n$ order statistics $X_{(1)}, \ldots, X_{(n)}$ is

$$g_{1 \ldots n}(x_1, \ldots, x_n) = n!\, f(x_1) \ldots f(x_n), \quad x_1 < x_2 < \ldots < x_n.$$

1.32*. Prove that if in some neighbourhoods of the quantiles $\zeta_{p_1}$
and $\zeta_{p_2}$, $0 < p_1 < p_2 < 1$, the density $f(x)$ is continuous together
with its derivative and $f(\zeta_{p_i}) > 0$, $i = 1, 2$, then as $n \to \infty$
the sample quantiles $Z_{n, p_i} = X_{([np_i] + 1)}$, $i = 1, 2$, are
asymptotically normal as $\mathcal{N}\left((\zeta_{p_1}, \zeta_{p_2}),\ \frac{1}{n}\|\sigma_{ij}\|\right)$, where

$$\sigma_{ij} = \frac{p_i(1 - p_j)}{f(\zeta_{p_i})\, f(\zeta_{p_j})}, \quad i \le j.$$

Generalize the proof to the case of $r$ quantiles.

1.33*. Prove that for a sample from an absolutely continuous distribution
the extreme order statistics $X_{(r)}$ and $X_{(n-s+1)}$ are asymptotically
independent as $n \to \infty$ for fixed $r, s \ge 1$.

Hint. Go over to the random variables $x_n = nF(X_{(r)})$ and
$y_n = n[1 - F(X_{(n-s+1)})]$ and use the result obtained in Problem 1.31.

1.34. Let $\mathcal{L}(\xi) = \Gamma(1, 1)$. Prove that the random variables
$Y_r = (n - r + 1)(X_{(r)} - X_{(r-1)})$, $r = 1, \ldots, n$, $X_{(0)} = 0$, are
independent and identically distributed with the density $f(x) = e^{-x}$,
$x > 0$. Calculate $E X_{(k)}$, $D X_{(k)}$ and investigate the asymptotic
behaviour of $E X_{(n)}$ and $D X_{(n)}$ as $n \to \infty$.

Hint. Use the formula

$$\sum_{r=1}^{n} x_r = \sum_{r=1}^{n} (n - r + 1)(x_{(r)} - x_{(r-1)}), \quad x_{(0)} = 0,$$

and the result of Problem 1.31.
1.35. Make sure that in the case of $\mathcal{L}(\xi) = R(0, 1)$ the
distributions of the order statistics have the form

$$\mathcal{L}(X_{(k)}) = B(k, n - k + 1),$$
$$\mathcal{L}(X_{(l)} - X_{(k)}) = B(l - k, n - l + k + 1),$$

$1 \le k < l \le n$. Calculate the means and variances of these
distributions and $\mathrm{cov}(X_{(k)}, X_{(l)})$.

1.36. Let $\mathcal{L}(\xi) = R(a, b)$. Prove that the density of the joint
distribution of the extreme values $X_{(1)}$ and $X_{(n)}$ of a sample has the form

$$\frac{n(n - 1)}{(b - a)^n}\,(x_2 - x_1)^{n-2}, \quad a \le x_1 \le x_2 \le b.$$

Derive the formulas

$$E X_{(1)} = \frac{na + b}{n + 1}, \qquad E X_{(n)} = \frac{a + nb}{n + 1},$$

$$D X_{(1)} = D X_{(n)} = \frac{n(b - a)^2}{(n + 1)^2 (n + 2)},$$

$$\mathrm{cov}(X_{(1)}, X_{(n)}) = \frac{(b - a)^2}{(n + 1)^2 (n + 2)}.$$

1.37. Let $\mathcal{L}(\xi)$ be Weibull's distribution W(a, a, b). Find the
distribution of the minimum value of the sample $X_{(1)}$ and compute
$E X_{(1)}$ and $D X_{(1)}$.

1.38. Let $X_i = (X_{1i}, X_{2i})$, $i = 1, \ldots, n$, be independent
observations on a two-dimensional random variable $\xi = (\xi_1, \xi_2)$ with
the distribution function $F(x_1, x_2)$. The empirical distribution function is

$$F_n(x_1, x_2) = \frac{1}{n} \sum_{i=1}^{n} I(X_{1i} \le x_1)\, I(X_{2i} \le x_2)$$

(compare it with (1.1)). Calculate $E F_n(x_1, x_2)$ and $D F_n(x_1, x_2)$ and show
that $F_n(x_1, x_2) \to F(x_1, x_2)$ as $n \to \infty$. Construct the sample correlation
coefficient $\rho_n$ and show that $\rho_n \to \rho = \mathrm{corr}(\xi_1, \xi_2)$
if $E\xi_j^2 < \infty$ and $D\xi_j > 0$, $j = 1, 2$.

1.39. A distribution $\mathcal{L}_a$ which depends on a parameter $a$ is said to
be reproducible in this parameter if the independent random variables
$\xi_1$ and $\xi_2$ distributed as $\mathcal{L}_{a_1}$ and $\mathcal{L}_{a_2}$,
respectively, satisfy the condition $\mathcal{L}(\xi_1 + \xi_2) = \mathcal{L}_{a_1 + a_2}$
(this is sometimes written as $\mathcal{L}_{a_1} * \mathcal{L}_{a_2} = \mathcal{L}_{a_1 + a_2}$,
where $*$ stands for convolution).

Make sure that the following assertions are true:

(1) $\mathcal{N}(\mu_1, \sigma_1^2) * \mathcal{N}(\mu_2, \sigma_2^2) = \mathcal{N}(\mu_1 + \mu_2, \sigma_1^2 + \sigma_2^2)$;

(2) $\Gamma(a, \lambda_1) * \Gamma(a, \lambda_2) = \Gamma(a, \lambda_1 + \lambda_2)$;

(3) $M(n_1; p_1, \ldots, p_N) * M(n_2; p_1, \ldots, p_N) = M(n_1 + n_2; p_1, \ldots, p_N)$;
specifically, $Bi(n_1, p) * Bi(n_2, p) = Bi(n_1 + n_2, p)$;

(4) $\Pi(\lambda_1) * \Pi(\lambda_2) = \Pi(\lambda_1 + \lambda_2)$;

(5) $\overline{Bi}(r_1, p) * \overline{Bi}(r_2, p) = \overline{Bi}(r_1 + r_2, p)$.

Hint. Use the fact that the characteristic function of a sum of
independent random variables is equal to the product of the
characteristic functions of the terms. If the random variables are
discrete, it is more convenient to use the generating functions $E z^{\xi}$
instead of the characteristic functions $E e^{it\xi}$.

1.40*. Suppose that a random vector $Y$ has a non-degenerate normal
distribution $\mathcal{N}(\mu, \Sigma)$ and $Q = (Y - \mu)' A (Y - \mu)$, where the
matrix $A$ satisfies the condition $A = A \Sigma A$. Prove that
$\mathcal{L}(Q) = \chi^2(m)$, where $m = \mathrm{tr}(A\Sigma)$.
In the case of $A = \Sigma^{-1}$ the number of the degrees of freedom $m$
coincides with the dimensionality of the vector $Y$.

1.41. Let the joint distribution of two random variables $X$ and $Y$
be such that the conditional distribution of $X$ when $Y = y$ is
$\mathcal{N}(y, \sigma_1^2)$-normal, while $\mathcal{L}(Y) = \mathcal{N}(\mu, \sigma_2^2)$.
Prove that $\mathcal{L}(X) = \mathcal{N}(\mu, \sigma_1^2 + \sigma_2^2)$.

Hint. Compute the distribution density of $X$ by the formula

$$f_X(x) = \int f_{X|Y}(x \mid y)\, f_Y(y)\, dy,$$

where $f_{X|Y}(x \mid y)$ is the density of the conditional distribution
$\mathcal{L}(X \mid Y = y)$.

1.42. Suppose that the random variables $X_1$ and $X_2$ are independent
and

$$\mathcal{L}(X_1) = \Gamma(a, \lambda), \qquad \mathcal{L}(X_1 + X_2) = \Gamma(a, \lambda + \nu), \quad \nu > 0.$$

How is the random variable $X_2$ distributed?

Hint. Calculate the characteristic function for $X_2$ (see the solution
to Problem 1.39 (2)).

1.43. Let $\xi_1$ and $\xi_2$ be independent random variables uniformly
distributed on the segment [0, 1]. Show that the quantities
$\eta_1 = \sqrt{-2 \ln \xi_1}\, \cos(2\pi\xi_2)$ and
$\eta_2 = \sqrt{-2 \ln \xi_1}\, \sin(2\pi\xi_2)$ are independent and normally
distributed with the parameters (0, 1).

Hint. Use formula (1.2).
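
The transformation in Problem 1.43 is also a practical way to simulate normal samples. The following is a minimal sketch in Python, assuming the standard `random` module as the source of uniform numbers; the function names `box_muller` and `normal_sample` are illustrative and do not come from the book.

```python
import math
import random


def box_muller(u1, u2):
    """Map two values from (0, 1] x [0, 1) to the pair (eta_1, eta_2)."""
    r = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return r * math.cos(angle), r * math.sin(angle)


def normal_sample(n, seed=0):
    """Simulate n values by repeatedly applying the transformation."""
    rng = random.Random(seed)
    out = []
    while len(out) < n:
        u1 = 1.0 - rng.random()  # lies in (0, 1], so log(u1) is finite
        u2 = rng.random()
        out.extend(box_muller(u1, u2))
    return out[:n]
```

The sample mean and variance of `normal_sample(n)` can then be compared with 0 and 1, in the spirit of Problems 1.15 and 1.18.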
1.44. Let the random variables $X_1$ and $X_2$ be independent, and let
$\mathcal{L}(X_i) = \Gamma(a, \lambda_i)$, $i = 1, 2$. Prove that the random
variables $Y_1 = X_1 + X_2$ and $Y_2 = X_1/(X_1 + X_2)$ are independent, and
$\mathcal{L}(Y_1) = \Gamma(a, \lambda_1 + \lambda_2)$ (this is the reproducibility
in $\lambda$; see Problem 1.39 (2)), $\mathcal{L}(Y_2) = B(\lambda_1, \lambda_2)$.

1.45. Prove that $\mathcal{L}\left(\frac{\chi_n^2 - n}{\sqrt{2n}}\right) \to \mathcal{N}(0, 1)$
as $n \to \infty$ and
$E(\chi_n^2)^k = n(n + 2) \ldots (n + 2(k - 1))$, $k = 1, 2, \ldots$.

Hint. Use the reproducibility of the distribution $\Gamma(a, \lambda)$ with
respect to $\lambda$ and apply the Central Limit Theorem.



1.46. Show that $\mathcal{L}(a + \xi/\eta) = C(a)$, where the random variables
$\xi$ and $\eta$ are independent and $\mathcal{N}(0, \sigma^2)$-normal, and
$\mathcal{L}(a + \tan \zeta) = C(a)$, where $\mathcal{L}(\zeta) = R(-\pi/2, \pi/2)$.




1.47. Show that if $\mathcal{L}(t_n) = S(n)$, then the moments $E t_n^k$ exist
if and only if $k < n$ and are of the form

$$E t_n^{2r} = \frac{1 \times 3 \times \ldots \times (2r - 1)\, n^r}{(n - 2)(n - 4) \ldots (n - 2r)}, \quad 2r < n,$$

$$E t_n^{2r+1} = 0, \quad 2r + 1 < n.$$

Prove that $S(n) \to \mathcal{N}(0, 1)$ as $n \to \infty$, and, moreover, the density

&tt - -7^ * -** Show that ^f/-— ' ) =b("±). 

Hint. Use Stirling's formula for the gamma function
$\Gamma(z) \sim \sqrt{2\pi}\, z^{z - 1/2} e^{-z}$, $z \to \infty$, and apply the law
of large numbers to the random variable $\chi_n^2/n$ (see the solution to
Problem 1.45). When calculating the moments, take into account that



! - = 1, Ji 



and the terms are independent. Use Problem 1.44. 



1.48*. Let $F(x; n_1, n_2)$ be the distribution function of the Snedecor
law $S(n_1, n_2)$, and let $B(x; a, b)$ be the distribution function for
$B(a, b)$. Show that

$$F(x; n_1, n_2) = B\left(\frac{n_1 x}{n_2 + n_1 x};\ \frac{n_1}{2}, \frac{n_2}{2}\right), \quad x > 0.$$

Derive the expression for the density $f_{n_1, n_2}(x)$ of the distribution
$S(n_1, n_2)$. Find the distribution moments.

1.49*. (Continued from Problem 1.48.) Prove that for any fixed
$x \in (0, 1)$ and $a > 0$

$$\lim_{n \to \infty} \frac{1}{n} \ln [1 - B(x; a, n)] = \ln (1 - x).$$

Show that for any fixed $t > 0$ and $m \ge 1$

$$\lim_{n \to \infty} \frac{1}{n} \ln [1 - F(tn; m, n)] = -\frac{1}{2} \ln (1 + mt).$$

Hint. Use Stirling's formula (see the hint to Problem 1.47) and
the theorem

$$\int_x^1 u^{a-1} (1 - u)^{b-1}\, du = \frac{c^{a-1}}{b}\, (1 - x)^b, \quad c \in [x, 1],$$

about the mean.

1.50*. Show that the density $s_n(x)$ of Student's distribution $S(n)$ can
be expressed through the density $f_{1,n}(x)$ of Snedecor's distribution
$S(1, n)$ as

$$s_n(x) = |x|\, f_{1,n}(x^2).$$

Prove the relation

$$\lim_{n \to \infty} \frac{1}{n} \ln P(t_n > d\sqrt{n}) = -\frac{1}{2} \ln (1 + d^2), \quad \forall d > 0.$$

Hint. Use the fact that $\mathcal{L}(t_n^2) = S(1, n)$.

1.51. Let $X = (X_1, \ldots, X_k, X_{k+1}, \ldots, X_{k+m})$ be a sample from
the exponential distribution $\mathcal{L}(\xi) = \Gamma(a, 1)$. Investigate the
random variable

$$Y = \frac{m}{k} \cdot \frac{X_1 + \ldots + X_k}{X_{k+1} + \ldots + X_{k+m}}$$

and prove that $\mathcal{L}(Y) = S(2k, 2m)$.

\Hint. Use the Tact that jf(r2) = SO, *). 

1.52. Let an integer-valued random vector $\nu = (\nu_1, \ldots, \nu_N)$ have
the polynomial distribution $M(n; p_1, \ldots, p_N)$.

(a) Show that the generating function for $(\nu_1, \ldots, \nu_k)$, $k \le N$,
has the form

$$E(z_1^{\nu_1} \ldots z_k^{\nu_k}) = \left(1 + \sum_{j=1}^{k} p_j (z_j - 1)\right)^n,$$

and, specifically, $\mathcal{L}(\nu_1) = Bi(n, p_1)$.

(b) Derive the general formula

$$E(\nu_1)_{k_1} \ldots (\nu_N)_{k_N} = (n)_{k_1 + \ldots + k_N}\, p_1^{k_1} \ldots p_N^{k_N}$$

for mixed factorial moments.

(c) Let V = 2 c^,, P = 2 cJA, / = 1- 2, ?? o 2 <&£*■ 
Show that 

EV = wP, cov (V. l) 1 ) = fffe'e 2 - c 1 ? 1 ). 

| Hint. Use the result obtained in Problem 1.39 (3). 
1.53*. (Continued from Problem 1.52.) Prove that as $n \to \infty$ for any
$k < N$ and fixed $p_i \in (0, 1)$, $i = 1, \ldots, N$,

$$\mathcal{L}\left((\nu_j - np_j)/\sqrt{n},\ j = 1, \ldots, k\right) \to \mathcal{N}(0, \Sigma_k = \|\sigma_{ij}\|_1^k),$$

where

$$\sigma_{ij} = \begin{cases} p_i(1 - p_i) & \text{for } i = j, \\ -p_i p_j & \text{for } i \ne j. \end{cases}$$

Hint. Use the theorem on the continuity of characteristic functions.
1.54. Prove that if the random variables $\xi_1, \ldots, \xi_N$ are independent
and $\mathcal{L}(\xi_j) = \Pi(\lambda_j)$, $j = 1, \ldots, N$, then the conditional
distribution is

$$\mathcal{L}(\xi_1, \ldots, \xi_N \mid \xi_1 + \ldots + \xi_N = n) = M(n; p_1, \ldots, p_N),$$

where $p_j = \dfrac{\lambda_j}{\lambda_1 + \ldots + \lambda_N}$, $j = 1, \ldots, N$.
It follows from here that

$$\mathcal{L}(\xi_j \mid \xi_1 + \ldots + \xi_N = n) = Bi(n, p_j).$$

1.55. Suppose that we have two random variables $\xi$ and $\Lambda$ and
$\mathcal{L}(\Lambda) = \Gamma(a, r)$ for some $a > 0$ and integer $r \ge 1$, and we
also have a conditional distribution $\mathcal{L}(\xi \mid \Lambda = \lambda) = \Pi(\lambda)$.
Show that the unconditional distribution $\mathcal{L}(\xi) = \overline{Bi}(r, p)$ for
$p = a/(a + 1)$.
1.56. Let $X = (X_1, \ldots, X_n)$ be a sample from $\mathcal{N}(\mu, \sigma^2)$.
Prove that $\bar{X}$ and $(X_1 - \bar{X}, \ldots, X_n - \bar{X})$ are independent.
Prove that the sample mean $\bar{X}$ and variance $S^2$ are independent.

Hint. Use the fact that the uncorrelatedness of normal random
variables implies their independence.

1.57*. Suppose that $X = (X_1, \ldots, X_n)$ is a sample from $\mathcal{N}(0, 1)$,
and the quadratic form $Q = X'X$ is represented as the sum
$Q = Q_1 + Q_2$ of two quadratic forms, where $Q_i = X'A_iX$ and
rank $A_i = n_i$, $i = 1, 2$. Prove that if $n_1 + n_2 = n$, then $Q_1$ and $Q_2$
are independent and $\mathcal{L}(Q_i) = \chi^2(n_i)$, $i = 1, 2$.

Hint. Check whether the matrices $A_1$ and $A_2$ are idempotent and
$A_1A_2 = 0$. Then use the assertions 1° and 2° from Sec. 1.6 (1).

Remark. A stronger assertion is true, i.e., if $Q = Q_1 + \ldots + Q_k$,
where $Q_i = X'A_iX$, rank $A_i = n_i$, $i = 1, \ldots, k$, then
$n = n_1 + \ldots + n_k$ implies that $Q_1, \ldots, Q_k$ are independent and
$\mathcal{L}(Q_i) = \chi^2(n_i)$, $i = 1, \ldots, k$.

1.58*. Let $X = (X_1, \ldots, X_n)$ be a sample from the normal distribution
$\mathcal{N}(\mu, \sigma^2)$. Find the distribution of the random variable

- Xl ' * 

~ Vn - IS ' 

1.59*. Use the notations of Problem 1.38 and assume that

$$\mathcal{L}(\xi_1, \xi_2) = \mathcal{N}\left((\mu_1, \mu_2), \Sigma\right), \quad \Sigma = \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}, \quad -1 < \rho < 1.$$

(a) Prove that $(\bar{X}_1, \bar{X}_2)$ and $(S_1^2, S_{12}, S_2^2)$ are independent.

(b) Assume that
$Q = n(\bar{X}_1 - \mu_1, \bar{X}_2 - \mu_2)'\, \Sigma^{-1} (\bar{X}_1 - \mu_1, \bar{X}_2 - \mu_2)$
and prove that $\mathcal{L}(Q) = \chi^2(2)$.

(c) Let $n > 2$ and $T = \sqrt{n - 2}\, \rho_n / \sqrt{1 - \rho_n^2}$, where
$\rho_n = S_{12}/(S_1 S_2)$ is the sample correlation coefficient. Prove that
for $\rho = 0$ we have

$$\mathcal{L}(T) = S(n - 2)$$

and find the distribution for $\rho_n$.

Hints. (a) See Problem 1.56 and its solution.

(b) Use Problem 1.40.

(c) Use the fact [3] that the density of the joint distribution of
the random variables $(S_1^2, S_{12}, S_2^2)$ has the form (for $\rho = 0$)

4 5r r(n- 2)(<7 1 02)''- 1 

Jfi, *i > 0; jcJj < jrua. 
Then consider new random variables 



<n S 2 of\ s\J 






3 2 «2 



and take into account that $T = Y_1/\sqrt{Y_2/(n - 2)}$.

1.60*. Let $\bar{X}$ and $S^2$ be the sample mean and variance for a sample
of size $n$ from the distribution $\Pi(\lambda)$. Prove that



_/■ 



(n-^^-^)^)^,!) 



as $n \to \infty$ and for any $\lambda > 0$.

i/rnf. Use Problems 1.27-29. First show the asymptotic nor- 

mality ^K(0, I) of the random variable f„ = J — - — x 

Mf ^-t-5M /K and then use the fact that X~fh ^ I as 

$n \to \infty$ and the assertion 2° from Sec. 1.4. When computing the
moments, use the formulas

$$\mu_2 = \mu_3 = \lambda, \qquad \mu_4 = \lambda + 3\lambda^2$$

for central moments of Poisson's distribution $\Pi(\lambda)$.

1.61. Let $X_1, \ldots, X_n$ be independent observations on a random
variable $\xi$ with $E\xi = \mu$, $D\xi = \sigma^2 > 0$, $E\xi^4 < \infty$, and let
$\bar{X}$ and $S^2$ be the respective sample mean and variance. Prove that for
$n \to \infty$ we have

$$\mathcal{L}\left(t_n = \sqrt{n}\,(\bar{X} - \mu)/S\right) \to \mathcal{N}(0, 1).$$

Hint. Use the Central Limit Theorem, the convergence $S/\sigma \to 1$
as $n \to \infty$, and the assertion 2° (c) from Sec. 1.4.

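1.62. Suppose that

$$\mathcal{L}(\xi_1, \xi_2) = \mathcal{N}\left((\mu_1, \mu_2), \begin{pmatrix} \sigma_1^2 & \rho\sigma_1\sigma_2 \\ \rho\sigma_1\sigma_2 & \sigma_2^2 \end{pmatrix}\right),$$

where $-1 < \rho = \mathrm{corr}(\xi_1, \xi_2) < 1$. Prove that the conditional
distribution is $\mathcal{L}(\xi_2 \mid \xi_1 = x) = \mathcal{N}(m(x), \sigma^2)$, where the
conditional mean

$$m(x) = E(\xi_2 \mid \xi_1 = x) = \mu_2 + \frac{\sigma_2}{\sigma_1}\, \rho\, (x - \mu_1)$$

is a regression function of $\xi_2$ on $\xi_1$, which in our case is a linear
function in $x$, and the conditional variance
$\sigma^2 = D(\xi_2 \mid \xi_1 = x) = \sigma_2^2(1 - \rho^2)$ does not depend on $x$.
Check whether there are other distributions of the form $\mathcal{L}(\xi_1, \xi_2)$
which are not normal but have the same properties as the conditional
distribution $\mathcal{L}(\xi_2 \mid \xi_1 = x)$.

Hints. (1) Calculate the conditional density of $\xi_2$ under the condition
$\xi_1 = x$ using the formula
$f_{\xi_2 | \xi_1}(y \mid x) = f_{\xi_1 \xi_2}(x, y)/f_{\xi_1}(x)$ and make sure that it
is equal to
$\dfrac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\dfrac{(y - m(x))^2}{2\sigma^2}\right\}$
with $m(x)$ and $\sigma^2$ as given above.

(2) Consider the joint distribution density
$f_{\xi_1 \xi_2}(x, y) = f_{\xi_1}(x)\, f_{\xi_2 | \xi_1}(y \mid x)$, where
$f_{\xi_2 | \xi_1}(y \mid x)$ is the conditional and $f_{\xi_1}(x)$ the
unconditional distribution density.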
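1.63*. (A generalized variant of Problem 1.62.) Let $X^{(1)}$, $X^{(2)}$ be
random vectors of arbitrary dimensionality, $E(X^{(i)}) = \mu^{(i)}$,
$D(X^{(i)}) = \Sigma_{ii}$, $i = 1, 2$, $\mathrm{cov}(X^{(1)}, X^{(2)}) = \Sigma_{12}$,
$\mathrm{cov}(X^{(2)}, X^{(1)}) = \Sigma_{21}$ $(= \Sigma_{12}')$,

$$\mathcal{L}(X^{(1)}, X^{(2)}) = \mathcal{N}\left((\mu^{(1)}, \mu^{(2)}), \Sigma\right), \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}, \quad |\Sigma| \ne 0.$$

Prove that the conditional distribution
$\mathcal{L}(X^{(2)} \mid X^{(1)} = x^{(1)}) = \mathcal{N}(M(x^{(1)}), B)$, where

$$M(x^{(1)}) = \mu^{(2)} + A(x^{(1)} - \mu^{(1)}), \quad A = \Sigma_{21}\Sigma_{11}^{-1},$$

$$B = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.$$

Show that when dim $X^{(2)} = 1$, dim $X^{(1)} = p - 1$, we have

$$A = -\frac{1}{\sigma^{pp}}\,(\sigma^{p1}, \ldots, \sigma^{p,p-1}), \quad B = \frac{1}{\sigma^{pp}},$$

where $\Sigma^{-1} = \|\sigma^{ij}\|_1^p$.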

Hints. (1) Consider a linear transformation $Y^{(1)} = X^{(1)}$,
$Y^{(2)} = X^{(2)} - AX^{(1)}$ and make sure that $Y^{(1)}$ and $Y^{(2)}$ are
independent. Then

$$\mathcal{L}(X^{(2)} \mid X^{(1)} = x^{(1)}) = \mathcal{L}(Y^{(2)} + AY^{(1)} \mid Y^{(1)} = x^{(1)}) = \mathcal{L}(Y^{(2)} + Ax^{(1)}).$$

(2) Write the joint density

$$C \exp\left\{-\frac{1}{2} \sum_{i,j} \sigma^{ij}(x_i - \mu_i)(x_j - \mu_j)\right\}$$

and show that the variable $x_p$ in the expression for the conditional
density has the form

$$\exp\left\{-\frac{1}{2}\, \sigma^{pp} \left(x_p - \mu_p + \sum_{j=1}^{p-1} \sigma^{pj}(x_j - \mu_j)/\sigma^{pp}\right)^2\right\}.$$

1.64. An algorithm to simulate a random variable with Poisson's
distribution $\Pi(\lambda)$ is based on the following fact (prove this!). Suppose
that $U_i$, $i = 1, 2, \ldots$, are independent random variables uniformly
distributed as $R(0, 1)$ and
$\xi = \max\left\{k: \prod_{i=1}^{k} U_i \ge e^{-\lambda}\right\}$ (the empty product
for $k = 0$ is equal to 1), then $\mathcal{L}(\xi) = \Pi(\lambda)$.

Hint. Using the relation
$\mathcal{L}\left(-\sum_{i=1}^{k} \ln U_i\right) = \Gamma(1, k)$, calculate
the probability of the event

$$\{\xi = k\} = \left\{-\sum_{i=1}^{k} \ln U_i < \lambda,\ -\sum_{i=1}^{k+1} \ln U_i > \lambda\right\}.$$
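
The fact stated in Problem 1.64 translates directly into a simulation routine. The following is a minimal sketch in Python under the assumption that the product is taken over $U_1, \ldots, U_k$, as in the hint; the function name `poisson_by_products` and the use of `random.Random` are illustrative choices, not part of the book.

```python
import math
import random


def poisson_by_products(lam, rng):
    """Return max{k >= 0: U_1 * ... * U_k >= exp(-lam)} (empty product = 1)."""
    threshold = math.exp(-lam)
    k = 0
    prod = rng.random()          # U_1
    while prod >= threshold:
        k += 1                   # the product of k uniforms is still >= threshold
        prod *= rng.random()     # multiply by the next uniform number
    return k
```

Since the sample mean and variance of a Poisson sample should both be close to $\lambda$, a simulated sample can be checked in the same way as the data of Problem 1.23.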

1.65. Let a bounded distribution density $f(x)$, $c = \max f(x)$, be
given on the segment $[a, b]$. We define the random variable

$$\nu = \min\{i \ge 1: cU_{2i-1} < f(a + (b - a)U_{2i})\},$$

where $\{U_i\}$ are as in the previous problem.

Prove that the random variable $\xi = a + (b - a)U_{2\nu}$ has the
distribution density $f(x)$.

Remark. This result gives a simulation technique for a distribution
with an arbitrary density, which satisfies the indicated restrictions.
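
The technique mentioned in the remark can be sketched as follows. This is an illustrative Python version, assuming the density is supplied as a function `f` bounded by `c` on `[a, b]`; the name `rejection_sample` and the example density $f(x) = 2x$ on $[0, 1]$ are assumptions made here for the sake of the example.

```python
import random


def rejection_sample(f, a, b, c, rng):
    """Draw one value with density f on [a, b], where f(x) <= c on [a, b]."""
    while True:
        u_height = rng.random()            # plays the role of U_{2i-1}
        x = a + (b - a) * rng.random()     # candidate a + (b - a) * U_{2i}
        if c * u_height < f(x):            # accept at the first success (i = nu)
            return x


def triangular_density(x):
    """Example density f(x) = 2x on [0, 1]; its maximum is c = 2."""
    return 2.0 * x
```

For instance, `rejection_sample(triangular_density, 0.0, 1.0, 2.0, random.Random(7))` returns one simulated value; the mean of a long series of such values should be close to 2/3.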
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Estimation of Distribution Parameters 



2.1. Suppose that we have a statistical model $\mathcal{F} = \{F\}$ for a scheme
of repeated independent observations on a random variable $\xi$ and
$X = (X_1, \ldots, X_n)$ is a sample from $\mathcal{L}(\xi)$. Any random variable
$T = T(X)$ which is only a function of the sample is called a statistic.
Given a sample $X$, we often have to estimate the true value of an
unknown theoretical characteristic $g = g(F)$, i.e., to construct a
statistic $T(X)$ which can be used as a reasonable approximation of the true
characteristic $g$. In this case the statistic $T(X)$ is called an estimator
for (of) $g$. Various estimators are used to estimate $g$, and we may
compare their quality by the measure of accuracy (the degree of closeness
to the true value of the estimated characteristic). If we are given a
class of estimators $\mathcal{T}$ and the measure of accuracy is chosen, then
the estimator which optimizes this measure is called an optimum
estimator (in the class $\mathcal{T}$).

The most popular measure of accuracy is a standard (mean square)
error $E(T(X) - g)^2$. This measure brings about a respective optimality
test, i.e., a minimal mean-square-error test. We often restrict ourselves
to the class $\mathcal{T}_g$ of unbiased estimators, viz.,
$T = T(X) \in \mathcal{T}_g \Leftrightarrow ET = g$ $\forall F \in \mathcal{F}$.
For the unbiased estimators $E(T - g)^2 = DT$, i.e., their variance
is a measure of their accuracy, and a minimal variance test is
in this case the optimality test. If a model $\mathcal{F}$ is parametric, i.e.,
$\mathcal{F} = \{F(x; \theta), \theta \in \Theta\}$, any theoretical characteristic
is a function of the parameter $\theta$. Then we are dealing with the
estimation of parametric functions denoted $\tau(\theta)$. The statistic
$T = T(X)$ is an unbiased estimator for $\tau(\theta)$ if the relation
$E_\theta T = \tau(\theta)$ $\forall \theta \in \Theta$ is true. A statistic $T^*$
for which $D_\theta T^* \le D_\theta T$ $\forall T \in \mathcal{T}_\tau$ and
$\forall \theta \in \Theta$ is an optimum estimator in the class
$\mathcal{T}_\tau$ of the unbiased estimators for the function $\tau = \tau(\theta)$.
We sometimes use $\tau^*$ to denote $T^*$ in order to stress that it is related
to the function $\tau(\theta)$. The optimum estimator $T^*$ (for a given model
$\mathcal{F}$ and a given parametric function $\tau(\theta)$) does not always
exist, but when it does exist, it is unique [7, p. 55]. It is important that
the optimality is linear, i.e., if $T_j^*$ is an optimum estimator for
$\tau_j = \tau_j(\theta)$, $j = 1, 2, \ldots$, then the statistic $\sum c_j T_j^*$
is an optimum estimator for a linear combination $\sum c_j \tau_j$ [7, p. 58].

Consistency is necessary for any estimation rule. This means that
as the size $n$ of a sample grows, the estimator must converge in
probability to the estimated characteristic, whatever the true distribution
of the observations. Thus, consistency is an asymptotic property of
estimators (in contrast to unbiasedness and optimality). When we want
to stress that the statistics we are studying depend on sample size, we
label them with the subscript $n$. When investigating the rule for
consistency, we use the following simple test [7, p. 91]. If
$E_\theta T_n = \tau(\theta) + \varepsilon_n$, $D_\theta T_n = \delta_n$, and
$\varepsilon_n = \varepsilon_n(\theta) \to 0$ (i.e., $T_n$ is an asymptotically
unbiased estimator for $\tau(\theta)$) and $\delta_n = \delta_n(\theta) \to 0$ as
$n \to \infty$ for all $\theta \in \Theta$,
then $T_n$ is a consistent estimator for $\tau(\theta)$.
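
The reason why this test works can be seen from Chebyshev's inequality. The following one-line sketch uses the notation of this section; the symbol $c > 0$ for an arbitrary tolerance is introduced here for illustration only.

```latex
P_\theta\bigl(|T_n - \tau(\theta)| \ge c\bigr)
  \le \frac{E_\theta (T_n - \tau(\theta))^2}{c^2}
  = \frac{D_\theta T_n + \bigl(E_\theta T_n - \tau(\theta)\bigr)^2}{c^2}
  = \frac{\delta_n + \varepsilon_n^2}{c^2} \to 0, \quad n \to \infty .
```

Both terms in the numerator vanish by assumption, so $T_n$ converges to $\tau(\theta)$ in probability for every $\theta \in \Theta$.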

2.2. We now consider the general tests for existence of optimum
estimators and the ways to construct them in the framework of the
general parametric model $\mathcal{F} = \{F(x; \theta), \theta \in \Theta\}$. Let
$f(x; \theta)$ be the distribution density of the observable random variable
$\xi$ (or the probability of the event $\{\xi = x\}$ in a discrete case), and
let $x = (x_1, \ldots, x_n)$ be a realization of the sample
$X = (X_1, \ldots, X_n)$. For fixed $x \in \mathfrak{X}$ the function
$L(x; \theta) = f(x_1; \theta) \ldots f(x_n; \theta)$ of the parameter
$\theta \in \Theta$ is called a likelihood function. We will assume that
$L(x; \theta) > 0$ for all $x \in \mathfrak{X}$ and $\theta \in \Theta$ and it is
differentiable with respect to $\theta$. Moreover, the following rule for
changing the order of differentiation and integration (when $\theta$ is a
scalar parameter) is valid: for any statistic $T$ (in particular,
for $T$ = const), we have

$$\frac{\partial}{\partial \theta} \int T(x) L(x; \theta)\, dx = \int T(x)\, \frac{\partial L(x; \theta)}{\partial \theta}\, dx.$$

(Integration is carried out over the entire sample space $\mathfrak{X}$, the
integrals are assumed to be absolutely convergent for all
$\theta \in \Theta$. For discrete models integration is replaced by
summation.) Finally, we introduce the random variable

$$U(X; \theta) = \frac{\partial \ln L(X; \theta)}{\partial \theta} = \sum_{i=1}^{n} \frac{\partial \ln f(X_i; \theta)}{\partial \theta},$$

which is called the sample contribution, and we will assume that

$$0 < E_\theta U^2(X; \theta) < \infty, \quad \forall \theta \in \Theta.$$

Models for which all these conditions are met are called regular.

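For a regular model $E_\theta U(X; \theta) = 0$ $\forall \theta \in \Theta$ and the
function $i_n(\theta) = D_\theta U(X; \theta) = E_\theta U^2(X; \theta)$, which is
called (Fisher's) information function, is defined. The quantity

$$i(\theta) = i_1(\theta) = E_\theta \left(\frac{\partial \ln f(X_1; \theta)}{\partial \theta}\right)^2 = -E_\theta\, \frac{\partial^2 \ln f(X_1; \theta)}{\partial \theta^2}$$

is also called the amount of (Fisher's) information contained in one
observation (the latter expression is used when the function $f(x; \theta)$
is twice differentiable with respect to $\theta$). In the case of repeated
independent observations $i_n(\theta) = ni(\theta)$.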
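
As an illustration of the two expressions for the amount of information, the following Python sketch evaluates both of them numerically for Poisson's distribution $\Pi(\lambda)$, where $\ln f(x; \lambda) = -\lambda + x \ln \lambda - \ln x!$. The function names and the truncation point `x_max` are assumptions made here; the known closed form $i(\lambda) = 1/\lambda$ serves as a check.

```python
import math


def poisson_pmf(x, lam):
    """Probability of the value x under the Poisson distribution with mean lam."""
    return math.exp(-lam + x * math.log(lam) - math.lgamma(x + 1))


def fisher_info_poisson(lam, x_max=200):
    """Return (E[score^2], -E[second derivative]) for one observation.

    The series over x is truncated at x_max, which is adequate when lam is
    small compared with x_max.
    """
    score_sq = 0.0
    minus_hess = 0.0
    for x in range(x_max + 1):
        p = poisson_pmf(x, lam)
        score_sq += p * (x / lam - 1.0) ** 2   # (d/d lam ln f)^2
        minus_hess += p * (x / lam ** 2)       # -(d^2/d lam^2 ln f)
    return score_sq, minus_hess
```

Both returned values should agree with each other and with $1/\lambda$ up to rounding and truncation error.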

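The introduced notions can be generalized to the case of a vector
parameter $\theta = (\theta_1, \ldots, \theta_r)$. Then the random vector

$$U = (U_1(X; \theta), \ldots, U_r(X; \theta)),$$

where

$$U_j(X; \theta) = \frac{\partial \ln L(X; \theta)}{\partial \theta_j}, \quad j = 1, \ldots, r,$$

is a sample contribution, and the information matrix
$I_n = I_n(\theta) = D_\theta(U) = E_\theta(UU')$ of the sample is an analogue of
the information function. The information matrix $I_1 = I = \|g_{ij}\|_1^r$ can
be calculated by the formulas

$$g_{ij} = g_{ij}(\theta) = E_\theta \left(\frac{\partial \ln f(\xi; \theta)}{\partial \theta_i}\, \frac{\partial \ln f(\xi; \theta)}{\partial \theta_j}\right) = -E_\theta \left(\frac{\partial^2 \ln f(\xi; \theta)}{\partial \theta_i\, \partial \theta_j}\right),$$

the last equation being true if the functions $f(x; \theta)$ are twice
differentiable.

For repeated independent observations we have $I_n(\theta) = nI(\theta)$. In this
case the definition of a regular model implies that the matrix $I(\theta)$ is
non-singular for all $\theta \in \Theta$.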

We can find lower bounds for variances of unbiased estimators for
a given differentiable parametric function $\tau(\theta)$ in a regular model.
Indeed, for any estimator $T = T(X) \in \mathcal{T}_\tau$ and all
$\theta \in \Theta$ the inequality

$$D_\theta T \ge \frac{[\tau'(\theta)]^2}{n\, i(\theta)}$$

holds for a scalar parameter $\theta$, and the inequality

$$D_\theta T \ge b'(\theta)\, I_n^{-1}(\theta)\, b(\theta), \quad b(\theta) = \left(\frac{\partial \tau(\theta)}{\partial \theta_1}, \ldots, \frac{\partial \tau(\theta)}{\partial \theta_r}\right)$$

holds for a vector parameter $\theta = (\theta_1, \ldots, \theta_r)$. This is the
Cramér-Rao inequality. The estimator $T^* \in \mathcal{T}_\tau$ for which the
indicated lower bound is reached is said to be efficient. If an efficient
estimator exists, it is, consequently, optimal (in the class
$\mathcal{T}_\tau$) and unique. The representation [7, p. 61]

$$T(X) - \tau(\theta) = a(\theta)\, U(X; \theta) \quad \text{if } \theta \text{ is a scalar},$$
$$T(X) - \tau(\theta) = a'(\theta)\, U(X; \theta) \quad \text{if } \theta \text{ is a vector}$$

is a test for efficiency. Here $a(\theta)$ ($a'(\theta)$) is a function (vector
function) of $\theta$.

In a given model $\mathcal{F}$ an efficient estimator can only exist for one
parametric function $\tau(\theta)$ (up to the transformation
$a\tau(\theta) + b$, where $a$ and $b$ are constants).
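
As an illustration of the efficiency test (a worked sketch that is not taken from the text), consider the Bernoulli model $Bi(1, \theta)$, $0 < \theta < 1$, and the function $\tau(\theta) = \theta$. The sample mean admits the representation required by the test, and its variance coincides with the Cramér-Rao bound:

```latex
f(x; \theta) = \theta^{x} (1 - \theta)^{1 - x}, \quad x \in \{0, 1\}, \qquad
U(X; \theta) = \sum_{i=1}^{n} \left( \frac{X_i}{\theta} - \frac{1 - X_i}{1 - \theta} \right)
             = \frac{n(\bar{X} - \theta)}{\theta(1 - \theta)},
\\
\bar{X} - \theta = a(\theta)\, U(X; \theta), \qquad a(\theta) = \frac{\theta(1 - \theta)}{n},
\\
i(\theta) = \frac{1}{\theta(1 - \theta)}, \qquad
D_\theta \bar{X} = \frac{\theta(1 - \theta)}{n} = \frac{[\tau'(\theta)]^{2}}{n\, i(\theta)} .
```

Hence $\bar{X}$ is an efficient estimator for $\theta$ in this model.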

If there is no efficient estimator, then we use the Bhattacharyya test
[7, p. 64] to find an optimum estimator $T^* = \tau^*$ (in the class of
unbiased estimators $\mathcal{T}_\tau$), i.e., taking into account the
higher-order derivatives of the likelihood function $L = L(X; \theta)$, we
choose a linear combination of them in order to obtain a representation of
the form



88,38, 
14 






d'L 
a '.< --*■ ^s 



We successively put here $s = 2, 3, \ldots$. If we manage to do so for
some $s \ge 2$ and the coefficients $a_\cdot = a_\cdot(\theta)$, then the statistic
$T = T(X)$ is an optimum estimator for the function $\tau = \tau(\theta)$.

2.3. The most effective way to construct optimum estimators is to
use sufficient statistics. A statistic $T = T(X)$ (generally a vector
statistic) is said to be sufficient for the model
$\mathcal{F} = \{F(x; \theta), \theta \in \Theta\}$ (or for the parameter $\theta$)
if the conditional density (probability in the discrete case)
$L(x \mid t; \theta)$ of a random vector (sample) $X = (X_1, \ldots, X_n)$ is
independent of the parameter $\theta$ under the condition $T(X) = t$. We may
use an equivalent definition, i.e., for any event $A \subset \mathfrak{X}$ the
conditional probability $P_\theta(X \in A \mid T(X) = t)$ is independent of
$\theta$. This property of the statistic $T$ implies that it covers all the
information about the parameter $\theta$ contained in a sample. Indeed, the
probability of any event which can occur at a fixed $T$ is independent of
$\theta$ and, hence, it has no additional information about $\theta$. The
sample $X$ is obviously a sufficient statistic, but we usually seek a
smallest-dimensional sufficient statistic which represents the original
data in the most compact form, i.e., we seek a minimal sufficient
statistic. A minimal sufficient statistic is a function of any other
sufficient statistic. We use the factorization test [7, p. 70] to construct
sufficient statistics, viz., a statistic $T(X)$ is sufficient for the
parameter $\theta$ if and only if the likelihood function can be represented as

$$L(x; \theta) = g(T(x); \theta)\, h(x),$$

where $g$ and $h$ are non-negative functions, and $h$ is independent of
$\theta$. If $T$ is a sufficient statistic, then any other function which is in
a one-to-one correspondence with $T$ is also a sufficient statistic.

The Rao-Blackwell-Kolmogorov theorem [7, p. 72] defines the role
of sufficient statistics in estimation theory. The theorem states that
for any unbiased estimator $T_1$ of a given function $\tau(\theta)$ we can
construct a new unbiased estimator $T^* = E_\theta(T_1 \mid T)$ which depends
on the sufficient statistic $T$ and obeys the inequality
$D_\theta T^* \le D_\theta T_1$. Consequently, an optimum estimator should be
sought among the functions of a sufficient statistic.
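
The effect of this theorem can be illustrated numerically. The sketch below is an assumed example, not one taken from the book: for a sample of size $n$ from $\Pi(\lambda)$ the function $\tau(\lambda) = e^{-\lambda}$ is estimated without bias by $T_1 = I(X_1 = 0)$, the statistic $T = X_1 + \ldots + X_n$ is sufficient with $\mathcal{L}(T) = \Pi(n\lambda)$, and $E(T_1 \mid T = t) = (1 - 1/n)^t$. The code computes the exact means and variances by summing over the distribution of $T$; the function names and the truncation point `t_max` are illustrative.

```python
import math


def poisson_pmf(k, mu):
    """Probability of the value k under the Poisson distribution with mean mu."""
    return math.exp(-mu + k * math.log(mu) - math.lgamma(k + 1))


def rao_blackwell_demo(lam, n, t_max=400):
    """Compare T1 = I(X_1 = 0) with T* = (1 - 1/n)^T for tau = exp(-lam).

    Returns (tau, mean of T*, variance of T1, variance of T*).
    The series over the values of T is truncated at t_max, which should be
    large compared with n * lam.
    """
    tau = math.exp(-lam)
    var_t1 = tau * (1.0 - tau)               # T1 is a Bernoulli variable
    mean_star = 0.0
    second_star = 0.0
    for t in range(t_max + 1):
        p = poisson_pmf(t, n * lam)
        h = (1.0 - 1.0 / n) ** t             # value of T* when T = t
        mean_star += p * h
        second_star += p * h * h
    return tau, mean_star, var_t1, second_star - mean_star ** 2
```

For `lam = 1.5` and `n = 10` the improved estimator keeps the same mean $e^{-\lambda}$, while its variance $e^{-2\lambda}(e^{\lambda/n} - 1)$ is much smaller than $e^{-\lambda}(1 - e^{-\lambda})$.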

We use an important property of completeness of a sufficient statistic
in order to find an explicit form of optimum estimators. The statistic
$T$ is said to be (boundedly) complete if for any (bounded) function
$\varphi(t)$ the equation $E_\theta \varphi(T) = 0$ $\forall \theta$ implies that
$\varphi(t) \equiv 0$ on the domain of $T$.

If a complete sufficient statistic exists, then every function of it is
an optimum estimator of its mean. Consequently, when estimating
a given parametric function $\tau(\theta)$, we find an optimum unbiased
estimator $\tau^*$, which is a function $\tau^* = H(T)$ of a complete sufficient
statistic $T$ and satisfies the unbiasedness equation
$E_\theta H(T) = \tau(\theta)$. This equation either has a unique solution or has
no solution. In the latter case the class $\mathcal{T}_\tau$ of unbiased
estimators for $\tau(\theta)$ is empty.

Many models in mathematical statistics belong to an $r$-parametric
exponential family, i.e., for them the function $f(x; \theta)$,
$\theta = (\theta_1, \ldots, \theta_r) \in \Theta \subset R^r$, can be represented
in the form

$$f(x; \theta) = \exp\left\{\sum_{j=1}^{r} \theta_j B_j(x) + C(\theta) + D(x)\right\}$$

(or it can be reduced to this form by a change of the parameters).
Then $T = (T_1, \ldots, T_r)$, $T_j = T_j(X) = \sum_{i=1}^{n} B_j(X_i)$,
$j = 1, \ldots, r$, is a minimal sufficient statistic and it is complete if
dim $\Theta = r$.

2.4. The maximum likelihood method is universal among the estimation
methods for unknown parameters. Given a sample $X = (X_1, \ldots, X_n)$,
we seek a maximum likelihood estimate (m.l.e.) $\hat\theta_n$ for
the parameter $\theta$, which is a point of the parametric set $\Theta$ where the
likelihood function $L(x; \theta)$ attains its maximum for every $X = x$,
i.e., $L(x; \hat\theta_n) \ge L(x; \theta)$ $\forall\theta$, or
$L(x; \hat\theta_n) = \sup_\theta L(x; \theta)$. If for any $x \in \mathcal{X}$
the function $L(x; \theta)$ attains its maximum at an internal point of
$\Theta$ and $L(x; \theta)$ is differentiable with respect to $\theta$, then the
m.l.e. $\hat\theta_n$ meets the likelihood equation

$$\frac{\partial \ln L(x; \theta)}{\partial\theta} = 0 \quad \Big(\text{or } \frac{\partial \ln L(x; \theta)}{\partial\theta_j} = 0,\ j = 1, \ldots, r, \text{ if } \theta = (\theta_1, \ldots, \theta_r)\Big).$$

For a parametric function $\tau(\theta)$, the m.l.e. is $\hat\tau_n = \tau(\hat\theta_n)$. This is the
invariance of maximum likelihood estimates.

When a likelihood equation cannot be solved exactly, we turn to
approximate methods of solution. One of them is a recurrent accumulation
method (due to Fisher), according to which the (k + 1)th approximation
for a m.l.e. is computed by the formula

$$\theta_{k+1} = \theta_k + U(x; \theta_k)/(n\,i(\theta_k)), \quad k = 0, 1, 2, \ldots$$

Here we choose an easily computed consistent estimate for $\theta$ as a first
approximation $\theta_0$.
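The recurrence just described can be sketched in code for one concrete case. This is only an illustration under assumptions not made in the text: the Cauchy model $C(\theta)$, for which $i(\theta) = 1/2$, the sample median as the first approximation, and the hypothetical helper names `cauchy_sample`, `score` and `fisher_scoring`.

```python
import math
import random
import statistics

def cauchy_sample(theta, n, rng):
    """Simulate n values from C(theta) by inverting the distribution function."""
    return [theta + math.tan(math.pi * (rng.random() - 0.5)) for _ in range(n)]

def score(xs, theta):
    """U(x; theta): derivative of the Cauchy log-likelihood with respect to theta."""
    return sum(2.0 * (x - theta) / (1.0 + (x - theta) ** 2) for x in xs)

def fisher_scoring(xs, theta0, info=0.5, steps=50):
    """Iterate theta_{k+1} = theta_k + U(x; theta_k) / (n * i(theta_k)).

    info stands for i(theta); it is constant (1/2) in the assumed Cauchy model.
    """
    theta = theta0
    for _ in range(steps):
        theta += score(xs, theta) / (len(xs) * info)
    return theta

rng = random.Random(1)
xs = cauchy_sample(3.0, 200, rng)
theta0 = statistics.median(xs)          # consistent first approximation
theta_hat = fisher_scoring(xs, theta0)
print(theta0, theta_hat)
```

The fixed number of steps is a simplification; in practice one would probably stop when successive approximations agree to the required accuracy.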
In the case of regular models the maximum likelihood estimates
have important asymptotic properties. Indeed [7, pp. 92-95], if $\hat\theta_n$
exists, is unique, and lies inside $\Theta$, then it is a consistent estimate for
$\theta$ and its distribution is asymptotically normal, viz.,

$$\mathcal{L}_\theta(\sqrt{n}(\hat\theta_n - \theta)) \to \mathcal{N}(0, I^{-1}(\theta))$$

as $n \to \infty$. If we additionally assume that the function $f(x; \theta)$ is three
times differentiable with respect to $\theta$ and

$$\left|\frac{\partial^3 \ln f(x; \theta)}{\partial\theta_i\,\partial\theta_j\,\partial\theta_k}\right| \le M(x),$$

where the function $M(x)$ is independent of $\theta$ and integrable, then
$E_\theta M(\xi) < \infty$. If the elements of the matrix $I(\theta)$ are continuous in $\theta$,
then

Moreover, if $\tau(\theta)$ is a continuously differentiable function and
$\hat\tau_n = \tau(\hat\theta_n)$ is its m.l.e., then

$$\mathcal{L}_\theta(\sqrt{n}(\hat\tau_n - \tau(\theta))) \to \mathcal{N}(0, \sigma_\tau^2(\theta))$$

as $n \to \infty$, where

$$\sigma_\tau^2(\theta) = b'(\theta) I^{-1}(\theta) b(\theta), \quad b(\theta) = \Big(\frac{\partial\tau(\theta)}{\partial\theta_1}, \ldots, \frac{\partial\tau(\theta)}{\partial\theta_r}\Big).$$

The quantity $\sigma_\tau^2(\theta)/n$ is called the asymptotic variance of the statistic $\hat\tau_n$
and coincides with the Cramér-Rao bound for the variances of the
unbiased estimators for the function $\tau(\theta)$. This property of maximum
likelihood estimates is called the asymptotic efficiency.

If we have another consistent and asymptotically normal estimator
$T_n$ for the function $\tau(\theta)$, viz.,
$\mathcal{L}_\theta(\sqrt{n}(T_n - \tau(\theta))) \to \mathcal{N}(0, \sigma_T^2(\theta))$ as
$n \to \infty$, then its "quality" can be measured by the quantity
$\mathrm{eff}(T_n; \theta) = \sigma_\tau^2(\theta)/\sigma_T^2(\theta)$, which is called the asymptotic efficiency of
the estimator $T_n$. The greater its asymptotic efficiency, the better (the
more exact) the estimator is asymptotically. For m.l.e.'s this quantity
is equal to unity.

2.5. Until now we have considered point estimation of the unknown
distribution parameters though mathematical statistics also deals with
confidence interval estimation or (for vector parameters) with estimation
by confidence sets. Let $\theta$ be a scalar. In interval estimation we
seek two statistics $T_i = T_i(X)$, $i = 1, 2$, such that $T_1 < T_2$. For these
statistics the condition

$$P_\theta(T_1(X) < \theta < T_2(X)) \ge \gamma \quad \forall\theta \in \Theta \qquad (*)$$

must hold at a confidence level $\gamma \in (0, 1)$. This (random) interval
$(T_1, T_2) \subset \Theta$ is called a $\gamma$-confidence interval for $\theta$. Its length $T_2 - T_1$
characterizes the accuracy of the unknown parameter localization,
while the confidence level $\gamma$ characterizes its "reliability", i.e., the
probability that the assertion $\theta \in (T_1, T_2)$ is erroneous does not exceed
$1 - \gamma$. In practice $\gamma$ is taken to be close to 1 ($\gamma = 0.95$, $0.99$, etc.),
and then the shortest (for a given class) interval is constructed for
the chosen $\gamma$.

We sometimes use one-sided confidence intervals (upper, of the
form $\theta < T_2(X)$, and lower, of the form $T_1(X) < \theta$), which are defined
by the conditions similar to (*) but without the second limit.

In the case of a vector parameter the confidence interval for a
separate component (for example, $\theta_1$) is chosen in a similar way, viz.,

$$P_\theta(T_1(X) < \theta_1 < T_2(X)) \ge \gamma \quad \forall\theta \in \Theta,$$

as well as the confidence interval for a parametric function $\tau(\theta)$:

$$P_\theta(T_1(X) < \tau(\theta) < T_2(X)) \ge \gamma \quad \forall\theta \in \Theta.$$

A $\gamma$-confidence region for a vector parameter $\theta = (\theta_1, \ldots, \theta_r)$ is
a random subset $G_\gamma(X) \subset \Theta$ for which

$$P_\theta(\theta \in G_\gamma(X)) \ge \gamma \quad \forall\theta \in \Theta.$$

This subset is constructed using a statistic $T(X)$ whose distribution
is known.

If we are estimating a scalar parameter and know that there exists
a random variable $G(X; \theta)$ which depends on the observations
$X = (X_1, \ldots, X_n)$ and on the estimated parameter, such that (1) the
distribution of $G(X; \theta)$ is independent of $\theta$ and (2) for every $x \in \mathcal{X}$
the function $G(x; \theta)$ is continuous and strictly monotone in $\theta$ (in this
case $G(X; \theta)$ is called a central statistic), then the $\gamma$-confidence interval
for $\theta$ is constructed in the following way. We define the numbers
$g_1 < g_2$ from the condition $P_\theta(g_1 < G(X; \theta) < g_2) = \gamma$ and solve the
equations $G(X; \theta) = g_1, g_2$ with respect to $\theta$. We use $T_i = T_i(X)$, $i = 1, 2$,
$T_1 < T_2$, to denote the solutions and find the required interval
$(T_1, T_2)$. The technique based on central statistics can be applied to
estimate the components of a parametric vector $\theta = (\theta_1, \ldots, \theta_r)$ and
scalar parametric functions $\tau = \tau(\theta)$.
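As a hedged illustration of the central-statistic construction, the sketch below treats one case that is assumed here rather than taken from the text: the uniform model $R(0, \theta)$ with $G(X; \theta) = X_{(n)}/\theta$, whose distribution function is $t^n$ on $[0, 1]$. The choice $g_1 = (1 - \gamma)^{1/n}$, $g_2 = 1$ and the function name `uniform_pivot_interval` are illustrative assumptions.

```python
import random

def uniform_pivot_interval(xs, gamma=0.95):
    """gamma-confidence interval for theta in R(0, theta).

    Central statistic: G(X; theta) = X_(n) / theta with P(G <= t) = t**n on [0, 1],
    so its distribution is free of theta and G is strictly monotone in theta.
    With g1 = (1 - gamma)**(1/n) and g2 = 1 we get P(g1 < G < g2) = gamma.
    """
    n = len(xs)
    x_max = max(xs)
    g1 = (1.0 - gamma) ** (1.0 / n)
    # Solving G = g2 and G = g1 for theta gives the two endpoints.
    return x_max, x_max / g1

rng = random.Random(2)
theta, n, trials = 5.0, 10, 2000
covered = 0
for _ in range(trials):
    lo, hi = uniform_pivot_interval([rng.uniform(0.0, theta) for _ in range(n)])
    covered += lo < theta < hi
print(covered / trials)   # empirical coverage, to be compared with gamma
```

Other admissible pairs $g_1 < g_2$ give other $\gamma$-confidence intervals; this particular pair is used only because it keeps the algebra short.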

If we have some point estimator $T = T(X)$ for the parameter $\theta$, and
its distribution function $F(t; \theta)$ is continuous and monotone in $\theta$, then,
having found from the equations (with respect to $\theta$)

$$F(T; \theta) = (1 - \gamma)/2, \quad 1 - F(T - 0; \theta) = (1 - \gamma)/2$$

two random numbers $T_i = T_i(X)$, $i = 1, 2$, $T_1 < T_2$, we find the central
$\gamma$-confidence interval $(T_1, T_2)$ for $\theta$.

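It is sometimes possible to construct approximate confidence intervals
for large samples using maximum likelihood estimates. Thus, if
$\tau(\theta)$, $\theta = (\theta_1, \ldots, \theta_r)$, is a continuously differentiable function and
$\hat\tau_n = \tau(\hat\theta_n)$ is its maximum likelihood estimate, then, in the case of a
regular model, the interval $(\hat\tau_n \pm c_\gamma \sigma_\tau(\hat\theta_n)/\sqrt{n})$ is an asymptotic
$\gamma$-confidence interval for $\tau(\theta)$, where $\sigma_\tau^2(\theta) = b'(\theta) I^{-1}(\theta) b(\theta)$,

$$b(\theta) = \Big(\frac{\partial\tau(\theta)}{\partial\theta_1}, \ldots, \frac{\partial\tau(\theta)}{\partial\theta_r}\Big), \quad c_\gamma = \Phi^{-1}\Big(\frac{1 + \gamma}{2}\Big).$$

Specifically, $(\hat\theta_n \pm c_\gamma/\sqrt{n\,i(\hat\theta_n)})$ is an asymptotic $\gamma$-confidence interval for a scalar
parameter $\theta$. Such intervals are asymptotically shortest and are based
on the standard normal approximation

$$\mathcal{L}_\theta\big(\sqrt{n}(\hat\tau_n - \tau(\theta))/\sigma_\tau(\hat\theta_n)\big) \to \mathcal{N}(0, 1)$$

for a m.l.e.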
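The scalar-parameter interval can be sketched for one model. The sketch assumes the Poisson model $\Pi(\theta)$, where the m.l.e. is the sample mean and $i(\theta) = 1/\theta$; the function name `poisson_asymptotic_interval` and the data are hypothetical.

```python
import math
from statistics import NormalDist, fmean

def poisson_asymptotic_interval(xs, gamma=0.95):
    """Asymptotic gamma-confidence interval theta_hat +/- c_gamma / sqrt(n * i(theta_hat)).

    Assumes the Poisson model: theta_hat is the sample mean and i(theta) = 1/theta.
    The sample mean must be positive, otherwise i(theta_hat) is undefined.
    """
    n = len(xs)
    theta_hat = fmean(xs)
    c_gamma = NormalDist().inv_cdf((1.0 + gamma) / 2.0)   # 2 * Phi(c) - 1 = gamma
    half_width = c_gamma / math.sqrt(n / theta_hat)
    return theta_hat - half_width, theta_hat + half_width

lo, hi = poisson_asymptotic_interval([2, 4, 3, 3, 5, 1, 3, 3])
print(lo, hi)
```

With only eight observations the normal approximation is rough; the data here merely show the arithmetic.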
Problems
Estimators and Their General Properties 

2.1. Show that the following statistics are unbiased and consistent:

(a) $T_n(X) = F_n(x)$ as the point estimators for the theoretical
distribution function $F(x)$ at a given point $x$;

(b) $T_n(X) = A_{nk}$ as the estimators for the theoretical moment $\alpha_k$;

(c) $T_n(X) = \frac{1}{n}\sum_{i=1}^{n}(X_i - \alpha_1)^2$ as the estimators for the variance
$\mu_2 = D\xi$ when the mean $\alpha_1 = E\xi$ is known;

(d) $T_n(X) = \frac{n}{n-1} S^2 \equiv S'^2$ as the estimators for $\mu_2$ in the general
case. Check whether $S^2$ is a consistent estimator for $\mu_2$.

Hint. Use Problem 1.27 and the Chebyshev inequality (assume
that the respective theoretical moments do exist).
2.2. In what cases is the statistic $T_n(X) = \sqrt{A_{n2}/2}$ a consistent
estimator for the theoretical mean $\alpha_1$?

2.3. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution $\mathcal{L}(\xi)$,
construct an unbiased estimator for its characteristic function.
Hint. Consider an empirical distribution function.
2.4. Let $X = ((X_{11}, X_{12}), \ldots, (X_{n1}, X_{n2}))$ be a sample from a
distribution of a two-dimensional random variable $\xi = (\xi_1, \xi_2)$. Prove that
the statistic $T(X) = \frac{n}{n-1} S_{12}$ is an unbiased estimator for
$\mu_{11} = \mathrm{cov}(\xi_1, \xi_2)$, where $S_{12}$ is the sample covariance (see the solution
to Problem 1.38).

Hint. Consider the random variable $\xi_1 + \xi_2$ and use the solution
to Problem 2.1 (d).
2.5. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$Bi(1, \theta)$. Describe the class of parametric functions $\tau(\theta)$ for which the
unbiased estimators $T(X)$ exist. Show that the functions $\tau(\theta) = 1/\theta^a$
for $a > 0$ and $\tau(\theta) = \theta^b$ for $b > n$ do not belong to this class.

2.6. Given the results of $n$ trials, estimate the unknown probability
of success $\theta$ in the Bernoulli trials $Bi(1, \theta)$. Use $r_n$ to denote the number
of successes in these trials and consider the class of estimators
$T = \frac{r_n + \alpha}{n + \beta}$. Compute the standard error of the estimator $T$ and
compare it with the error of an "ordinary" estimator $r_n/n$.

2.7. Suppose that $\mathcal{L}(\xi) = Bi(k, \theta)$ and $n = 1$. Consider the functions
of the form $\tau_{rs}(\theta) = \theta^r(1 - \theta)^s$ for integer $r, s \ge 0$. Show that
the unbiased estimator for $\tau_{rs}(\theta)$ exists if and only if $r + s \le k$ and
then it has the form

$$T(X) = (X)_r (k - X)_s/(k)_{r+s},$$

where $(a)_r = a(a - 1) \ldots (a - r + 1)$, $r \ge 1$, $(a)_0 = 1$.
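A quick numerical check of the unbiasedness claim is possible because the expectation is a finite sum. The sketch below is only a sanity check for one hypothetical choice of $k$, $r$, $s$ and $\theta$, not a proof; the helper names `falling`, `estimator` and `expectation` are illustrative.

```python
from math import comb

def falling(a, r):
    """(a)_r = a (a - 1) ... (a - r + 1), with (a)_0 = 1."""
    out = 1
    for j in range(r):
        out *= a - j
    return out

def estimator(x, k, r, s):
    """T(x) = (x)_r (k - x)_s / (k)_{r+s}; requires r + s <= k."""
    return falling(x, r) * falling(k - x, s) / falling(k, r + s)

def expectation(k, r, s, theta):
    """Exact E_theta T(X) for X ~ Bi(k, theta), summing over x = 0, ..., k."""
    return sum(
        comb(k, x) * theta**x * (1 - theta) ** (k - x) * estimator(x, k, r, s)
        for x in range(k + 1)
    )

k, r, s, theta = 6, 2, 3, 0.3   # hypothetical values with r + s <= k
print(expectation(k, r, s, theta), theta**r * (1 - theta) ** s)
```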

2.8. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$Bi(k, \theta)$ and $T = X_1 + \ldots + X_n$. Describe the class of parametric
functions $\tau(\theta)$ for which there exist unbiased estimators of the form
$H(T)$. Construct such an unbiased estimator for $\tau_r(\theta) = \theta^r$.

Hint. Use the reproducibility of the distribution $Bi(k, \theta)$ (see
Problem 1.39 (3) and Problem 1.52 (b)).

2.9. Suppose that $\mathcal{L}(\xi) = \Pi(\theta)$ and $n = 1$. Check whether
$T(X) = (X)_j$ is an unbiased estimator for $\tau(\theta) = \theta^j$, $j = 1, 2, \ldots$, and
show that there are no unbiased estimators for the functions
$\tau(\theta) = \theta^{-a}$, where $a > 0$. Construct an unbiased estimator for
$(1 + \theta)^{-1}$.

2.10. Given one observation on a discrete random variable $\xi$ with
the distribution

$$f(x; \theta) = \frac{\theta^x e^{-\theta}}{x!\,(1 - e^{-\theta})}, \quad x = 1, 2, \ldots$$

(Poisson's distribution truncated at zero), estimate the function $\tau(\theta) = 1 - e^{-\theta}$.
Show that only one unbiased estimator exists for it and that it is practically useless.

2.11. Let $\mathcal{L}(\xi) = \overline{Bi}(r, \theta)$ and $n = 1$. Construct an unbiased estimator
for the function $\tau(\theta) = \theta^s$ ($s \ge 1$ is an integer) and make sure that
for $r = 1$ this estimator is practically useless.

Hint. Use the formula $(1 - \theta)^{-r} = \sum_{x=0}^{\infty} C_{r+x-1}^{x}\,\theta^x$.

2.12. Show that $T(X) = -\sum_{k=1}^{X} \frac{1}{k}$ is the only unbiased estimator for
the function $\tau(\theta) = \ln(1 - \theta)$ in the model $\overline{Bi}(1, \theta)$ for $n = 1$.

2.13. Make sure that $T^* = \bar{X}^2 - \frac{\sigma^2}{n}$ is an unbiased estimator for
the function $\tau(\theta) = \theta^2$ in the model $\mathcal{N}(\theta, \sigma^2)$.

2.14. Given a sample of size $n$ in the model $\mathcal{N}(\mu, \theta^2)$, estimate
$\tau(\theta) = \theta^2$. Show that the sample variance $S^2$ has a smaller standard
error than the unbiased estimator $\tau^* = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$. Which of
the unbiased estimators ($\tau^*$ or $S'^2$) is more exact? (See Problem 2.1 (d).)

2.15. Prove that $T_n(X) = \sqrt{\frac{\pi}{2}}\,\frac{1}{n}\sum_{i=1}^{n}|X_i - \mu|$ is an unbiased and
consistent estimator for the parameter $\theta$ in the model $\mathcal{N}(\mu, \theta^2)$.




2.16*. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$\mathcal{N}(\mu, \theta^2)$ and $T^2 = \sum_{i=1}^{n}(X_i - \mu)^2$. Prove that the statistic

$$T_k = \frac{\Gamma\left(\frac{n}{2}\right)}{2^{k/2}\,\Gamma\left(\frac{n+k}{2}\right)}\,T^k$$

is an unbiased estimator for the function $\tau_k(\theta) = \theta^k$ for any integer $k \ge 1$.
Compare the estimator $T_1$ with that from the previous problem.

Hint. Use the fact that $\mathcal{L}(T^2/\theta^2) = \chi^2(n)$.

2.17. Suppose that we are given a sample $X = (X_1, X_2, X_3)$ from
the distribution $\mathcal{N}(0, \theta^2)$ and $T = T(X) = \sqrt{X_1^2 + X_2^2 + X_3^2}$. Consider
a statistic $p_T(x) = \frac{1}{2T} I(|x| \le T)$, where $I(\cdot)$ is an indicator,
which is a function of the variable $x$ and represents the uniform
distribution density on the segment $[-T, T]$. Show that $p_T(x)$ is an unbiased
estimator for the density of the original distribution $\mathcal{N}(0, \theta^2)$ for any
$x$.

Hint. Use the fact that $\mathcal{L}(T^2/\theta^2) = \chi^2(3)$.

2.18. Let us estimate an unknown variance $\theta_2^2$ in the general normal
model $\mathcal{N}(\theta_1, \theta_2^2)$. Let $X = (X_1, \ldots, X_n)$ be the respective sample, and
let $S'^2$ be the unbiased estimator for $\theta_2^2$ (see Problem 2.1 (d)). Consider
the class of estimators of the form $T_\lambda = \lambda S'^2$. Show that for
$\frac{n-3}{n+1} < \lambda < 1$ the statistic $T_\lambda$ has a smaller standard error than $S'^2$.
For what integer $k$ do the statistics $\frac{1}{n+k}\sum_{i=1}^{n}(X_i - \bar{X})^2$ belong to
this subclass? Using the minimal mean-square-error test, find an optimum
estimator in the class $\{T_\lambda\}$.

2.19*. (Continued from Probiem 2.18.) Construct optimum estima- 
tors of the form 7\ = XS' 2 , which minimize the measure E a (7\ — fl 2 )* 
and E 9 |7"x — flj|, respectively. 

Hint. Use Fisher's theorem and Problem 1.45.

2.20. Prove that the statistic

$$\Big(\frac{n}{2}\Big)^{k/2} \frac{\Gamma\left(\frac{n-1}{2}\right)}{\Gamma\left(\frac{n+k-1}{2}\right)}\,S^k,$$

where $S^2$ is a sample variance, is an unbiased estimator for the function
$\tau_k(\theta) = \theta_2^k$ at integer $k \ge 1$ in the model of Problem 2.18.
Consider the case of $n = 2$ and calculate the bias of the statistic
$|X_1 - X_2|$ which is an estimator for $\theta_2$.

Hint. Take into account that $\mathcal{L}(nS^2/\theta_2^2) = \chi^2(n - 1)$.
2.21. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$\Gamma(\theta, \lambda)$ and $T = X_1 + \ldots + X_n$. Make sure that the statistic
$\frac{\Gamma(\lambda n)}{\Gamma(\lambda n - a)}\,T^{-a}$ is an unbiased estimator for the function
$\tau_a(\theta) = \theta^{-a}$ for any $a < \lambda n$.

Hint. Use the reproducibility property of the gamma distribution
(see Problem 1.39 (2)).
2.22*. The lifetime of electric bulbs is distributed as $\Gamma(\theta, 1)$. In order
to estimate $\theta$ we take a sample of $n$ bulbs and observe the lifetimes
of the first $r$ burnt-out bulbs $X_{(1)} < X_{(2)} < \ldots < X_{(r)}$. Construct an
optimum unbiased estimator of the form $T(X) = \sum_{k=1}^{r} \lambda_k X_{(k)}$.

Hint. Use the random variables $Y_r = X_{(r)} - X_{(r-1)}$,
$r = 1, \ldots, n$ ($X_{(0)} = 0$), and Problem 1.34.

2.23. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution
$R(\theta, 2\theta)$, estimate the parameter $\theta$. Consider a class of estimators of
the form $T = T(X) = \alpha X_{(1)} + \beta X_{(n)}$, $\alpha, \beta \ge 0$, and find an optimum
unbiased estimator in it.

Hint. Use Problem 1.36.

2.24. Given a sample $X = (X_1, \ldots, X_n)$, estimate the parameter $\theta$
of the uniform distribution $R(0, \theta)$. Show that the statistics
$T_1 = \frac{n+1}{n} X_{(n)}$ and $T_2 = (n + 1) X_{(1)}$ are unbiased. Which of them
is better?

Hint. Use Problem 1.36. Show that $T_2$ is not a consistent
estimator.
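Before working through the algebra, the two statistics can be compared by simulation. The sketch below is only an empirical illustration with hypothetical settings ($\theta = 2$, $n = 20$, 4000 repetitions, a fixed seed); it does not replace the requested derivation.

```python
import random
from statistics import fmean, pvariance

def simulate(theta=2.0, n=20, reps=4000, seed=3):
    """Return simulated values of T1 = (n+1)/n * X_(n) and T2 = (n+1) * X_(1)."""
    rng = random.Random(seed)
    t1, t2 = [], []
    for _ in range(reps):
        xs = [rng.uniform(0.0, theta) for _ in range(n)]
        t1.append((n + 1) / n * max(xs))
        t2.append((n + 1) * min(xs))
    return t1, t2

t1, t2 = simulate()
print("T1: mean", fmean(t1), "variance", pvariance(t1))
print("T2: mean", fmean(t2), "variance", pvariance(t2))
```

Both empirical means should lie near $\theta$, while the spread of $T_2$ stays large; rerunning with a larger $n$ shows that it does not shrink.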

2.25. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$R(\theta_1, \theta_2)$. Prove that the statistics $T_1 = (X_{(1)} + X_{(n)})/2$ and
$T_2 = \frac{n+1}{n-1}(X_{(n)} - X_{(1)})$ are unbiased and consistent estimators for
the functions $\tau_1(\theta) = (\theta_1 + \theta_2)/2$ and $\tau_2(\theta) = \theta_2 - \theta_1$, respectively.

Hint. Use Problem 1.36.

2.26. Prove that if $X = (X_1, \ldots, X_n)$ is a sample from Weibull's
distribution $W(\theta, a, b)$ with an unknown shift parameter $\theta$, then the
statistic $T(X) = X_{(1)} - b\,\Gamma\left(1 + \frac{1}{a}\right) n^{-1/a}$ is an unbiased and consistent
estimator for the parameter $\theta$.

Hint. Use the solution to Problem 1.37.

2.27. Show that for a logistic distribution with the density
$f(x; \theta) = e^{-(x-\theta)}(1 + e^{-(x-\theta)})^{-2}$, $-\infty < x < \infty$, $\theta \in (-\infty, \infty)$, the
sample mean $\bar{X}$ is an unbiased and consistent estimator for the
parameter $\theta$.

2.28. Show that the sample mean $\bar{X}$ is not a consistent estimator
for the parameter $\theta$ in the Cauchy model $C(\theta)$.

Hint. Use the property of an arithmetic mean of Cauchy's
distribution.

2.29. Estimation of a polynomial distribution. Let a random variable
$\xi$ have a finite number of values $a_1, \ldots, a_N$ with unknown probabilities
$p_1, \ldots, p_N$, $p_1 + \ldots + p_N = 1$. In order to estimate the
parameter $\theta = (p_1, \ldots, p_{N-1})$, where $p_N = 1 - p_1 - \ldots - p_{N-1}$,
we carry out $n$ independent observations on $\xi$. Let $\nu_r$ be the number
of units in a sample which are equal to $a_r$, $r = 1, \ldots, N$.

(a) Show that the statistics $T_r = \nu_r/n$, $r = 1, \ldots, N$, are unbiased
and consistent estimators for the parameters $p_1, \ldots, p_N$, respectively.

(b) Describe the class of the parametric functions $\tau(\theta)$ for which
the unbiased estimators of the form $H(T_1, \ldots, T_N)$ exist.

(c) Construct an unbiased and consistent estimator for the function

r(S) = 2 dpi. 
i- i 

Hint. Take into account that $\mathcal{L}(\nu_1, \ldots, \nu_N) = M(n; p_1, \ldots, p_N)$
and use the solution to Problem 1.52.

2.30. Estimation by the method of moments. Let $X = (X_1, \ldots, X_n)$
be a sample from the distribution $\mathcal{L}(\xi) \in \{F(x; \theta), \theta = (\theta_1, \ldots, \theta_r) \in \Theta\}$
and suppose that the moments $\alpha_k(\theta) = E_\theta \xi^k$, $k = 1, \ldots, r$, exist.
Then, solving the equations $\alpha_k(\theta) = A_{nk}$, $k = 1, \ldots, r$, in
$\theta_1, \ldots, \theta_r$, where $A_{nk} = A_{nk}(X)$ is the sampling moment
of kth order, we find the estimates for the parameters using the
method of moments.

Using the method of moments, find the estimates for the parameters
of the gamma distribution $\Gamma(\theta_1, \theta_2)$ and make sure that they are
consistent.
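The moment equations for a two-parameter model can be tried out numerically. The sketch below assumes the gamma parametrization in which the first parameter is a scale and the second a shape, so that the mean is scale times shape and the variance is scale squared times shape; this assumption, the function name `gamma_moments_estimates` and the simulated data are illustrative and should be checked against the book's list of distributions.

```python
import random

def gamma_moments_estimates(xs):
    """Method-of-moments estimates (scale, shape) from A_n1 and A_n2.

    Assumes mean = scale * shape and variance = scale**2 * shape.
    """
    n = len(xs)
    a1 = sum(xs) / n                    # first sampling moment A_n1
    a2 = sum(x * x for x in xs) / n     # second sampling moment A_n2
    var = a2 - a1 * a1
    return var / a1, a1 * a1 / var

rng = random.Random(4)
# random.gammavariate(alpha, beta): alpha is the shape, beta is the scale
xs = [rng.gammavariate(3.0, 2.0) for _ in range(20000)]
scale_hat, shape_hat = gamma_moments_estimates(xs)
print(scale_hat, shape_hat)
```

Repeating the run with growing sample sizes gives an informal picture of the consistency that the problem asks to establish.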

2.31. Using the method of moments, find the estimates for the
parameters of a bivariate Poisson distribution given by the probabilities

$$f(x; \theta) = \frac{1}{2}\Big(e^{-\theta_1}\frac{\theta_1^x}{x!} + e^{-\theta_2}\frac{\theta_2^x}{x!}\Big),$$

$x = 0, 1, 2, \ldots$, $\theta = (\theta_1, \theta_2)$, $0 < \theta_1 < \theta_2$.
This distribution describes, for example, the number of collisions of
gas molecules in a Wilson's chamber with the particles formed when
a uranium nucleus is bombarded by neutrons.

Calculate the estimates for the data obtained in $n = 327$ observations
on a random variable $\xi$ ($n_x$ denotes the number of observations
where $\xi = x$), viz.,

x      0    1    2    3    4    5    6    7    8    9    10
n_x    28   47   81   67   53   24   13   8    3    2    1

2.32. Simulate samples of sizes $n = 10, 400, 1000$ from the uniform
distribution $R(0, \theta)$ for $\theta = 1$ and estimate the parameter using the
method of moments.
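One possible way to organize such a simulation is sketched below. It assumes a pseudorandom generator from a standard library in place of published tables of random numbers, and uses the first-moment equation $\theta/2 = A_{n1}$; the function name `uniform_moment_estimate` and the seed are hypothetical.

```python
import random

def uniform_moment_estimate(xs):
    """Solve alpha_1(theta) = theta / 2 = A_n1 for theta."""
    return 2.0 * sum(xs) / len(xs)

rng = random.Random(5)
estimates = {}
for n in (10, 400, 1000):
    sample = [rng.random() for _ in range(n)]   # R(0, 1), i.e. theta = 1
    estimates[n] = uniform_moment_estimate(sample)
print(estimates)
```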

2.33*. Sampling inspection. Statistical control over the quality of
products may be carried out in the following way. We choose for control
$n$ items from a batch of $N$ items at random and without replacement.
Every one of the $n$ items is tested. If the number $k$ of defective items
in the sample meets the condition $k \le k_0$, where $k_0$ is a preassigned
level ($k_0 < n$), then these items are replaced by good ones and the
batch is accepted. If $k > k_0$, all the $N$ items are inspected and the
defectives are replaced. We use $D$ to denote the unknown number of
defectives in the batch ($D = 0, 1, \ldots, N$), and the random variable
$\xi$ to denote the number of defectives found in this way. Then

$$P_D(\xi = k) = f(k; D, n) = C_D^k C_{N-D}^{n-k}/C_N^n, \quad k = 0, 1, \ldots, k_0,$$

$$P_D(\xi = D) = \sum_{k \ge k_0 + 1} f(k; D, n) \quad (\text{for } D > k_0).$$

Let us estimate a given function $\tau(D)$ of the number of defectives in
a batch. Prove that a unique statistic $T(\xi)$ always exists and is an
unbiased estimator for $\tau(D)$, i.e., the function $T(k)$ is uniquely defined by
the conditions

$$\sum_{k=0}^{k_0} T(k) f(k; D, n) + T(D) \sum_{k \ge k_0 + 1} f(k; D, n) = \tau(D), \quad D = 0, 1, \ldots, N.$$

Consider the case $\tau(D) = D$.

Hint. Use the fact that for $D \le k_0$ the hypergeometric probabilities
$f(k; D, n)$ are zero for $k > k_0$.

2.34*. Estimation of a finite population. Suppose that we have a
finite population $U = \{u_1, \ldots, u_N\}$ of $N$ objects each of which is
characterized by a quantity $x(u)$, $u \in U$. The values $x_i = x(u_i)$,
$i = 1, \ldots, N$, are unknown, and we are to estimate their sum
$T = T(x) = \sum_{i=1}^{N} x_i$. We stipulate that every subset (sample)
$s = (u_{i_1}, \ldots, u_{i_{n(s)}})$ of units of $U$ can be observed with a probability $p(s)$.
Thus, if $S = \{s\}$ is a set comprising all the samples, then
$\sum_{s \in S} p(s) = 1$. In this case we say that a sampling plan $(U, S, P)$
is defined. The statistics $e(s, x)$ which only depend on $x$ through the
$x_i$ for which $u_i \in s$ are chosen as the estimators for $T$ (i.e., the estimator
is a function of the chosen objects and their observed x-values).
A statistic

$$\hat{e}(s, x) = \sum_{u \in s} \frac{x(u)}{\pi(u)}$$

is called a Horvitz-Thompson estimator, where $\pi(u) = \sum_{s \ni u} p(s)$ is the
probability that the object $u$ is included in the sample.

(a) Prove that $\hat{e}(s, x)$ is an unbiased estimator for $T(x)$, i.e.,

$$\sum_{s} p(s)\,\hat{e}(s, x) = T(x) \quad \forall x \in R^N,$$

and that there are no other unbiased estimators of the form
$\sum_{u \in s} a(u) x(u)$.

(b) Derive the formula for the variance of the Horvitz-Thompson
estimator

$$D\hat{e}(s, x) = \sum_{u} x^2(u)\Big(\frac{1}{\pi(u)} - 1\Big) + \sum_{u \ne v} x(u) x(v)\Big(\frac{\pi(u, v)}{\pi(u)\pi(v)} - 1\Big),$$

where $\pi(u, v) = \sum_{s \ni u, v} p(s)$ is the probability that the objects $u$ and
$v$ are included in the sample.

(c) Check whether the statistic

$$\sum_{u \in s} \frac{x^2(u)}{\pi(u)}\Big(\frac{1}{\pi(u)} - 1\Big) + \sum_{u \ne v;\ u, v \in s} \frac{x(u) x(v)}{\pi(u, v)}\Big(\frac{\pi(u, v)}{\pi(u)\pi(v)} - 1\Big)$$

is an unbiased estimator for $D\hat{e}(s, x)$.

(d) Show that the mean and variance of the size $n(s)$ of a sample $s$ for
a sampling plan $(U, S, P)$ can be expressed through the probabilities
of inclusion $\pi(u)$ and $\pi(u, v)$ in the following way:

$$E n(s) = \sum_{u} \pi(u), \quad D n(s) = \sum_{u} \pi(u)(1 - \pi(u)) + \sum_{u \ne v} (\pi(u, v) - \pi(u)\pi(v)).$$

Hint. Introduce the indicator random variables $\gamma(u) = I(u \in s)$
and write $n(s)$ and $\hat{e}(s, x)$ in the form

$$n(s) = \sum_{u} \gamma(u), \quad \hat{e}(s, x) = \sum_{u} \gamma(u) x(u)/\pi(u).$$
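For a very small population the unbiasedness of the Horvitz-Thompson estimator can be checked by enumerating every sample. The sketch below uses a hypothetical population of four units, a hypothetical sampling plan on all pairs of units with unequal probabilities, and exact rational arithmetic; it illustrates the definitions and is not a substitute for the proof.

```python
from fractions import Fraction
from itertools import combinations

x = {"u1": 3, "u2": 5, "u3": 8, "u4": 4}      # hypothetical x-values
samples = list(combinations(sorted(x), 2))     # S: all pairs of units
weights = [1, 2, 1, 2, 3, 1]                   # hypothetical, unequal
p = {s: Fraction(w, sum(weights)) for s, w in zip(samples, weights)}

# pi(u): total probability of the samples that contain u
pi = {u: sum(ps for s, ps in p.items() if u in s) for u in x}

def horvitz_thompson(s):
    """e(s, x) = sum over u in s of x(u) / pi(u)."""
    return sum(Fraction(x[u]) / pi[u] for u in s)

expected = sum(p[s] * horvitz_thompson(s) for s in samples)
print(expected, sum(x.values()))   # the two numbers should coincide
```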

2.35*. (Continued from Problem 2.34.) Consider a sampling plan
$(U, S, P)$ which generates unrepeated equiprobable samples of
size $n$. In this case the set $S$ consists of $(N)_n$ ordered combinations
of size $n$ of various units of $U$ and $p(s) = 1/(N)_n$ for every $s \in S$.

(a) Show that in our case the Horvitz-Thompson estimator has the
form

$$\hat{e}(s, x) = \frac{N}{n}\sum_{u \in s} x(u).$$



(b) We use H 



ft = 



to denote the mean and variance of the population $U$, respectively.
Then $\frac{1}{N}\hat{e}(s, x) = \bar{x}$ (the sample mean of the observed x-values) is an



-<HV 



unbiased estimator for $\mu$. Check whether $D\bar{x}$
(c) Prove that the statistic 

w — 1 ■ - 



Remark. A stronger result is true, i.e., $d^2(s, x)$ is an optimum estimator
for $\sigma^2$ in the class of all the unbiased quadratic estimators of the
form

$$\sum_{u, v \in s} a(u, v)(x(u) - \bar{x})(x(v) - \bar{x}).$$

2.36*. Estimation of the size of a finite population. Suppose that
we are given a finite population $U$ with an unknown number $N$ of
elements. We draw from it a simple unrepeated sample of size $m$ and
make this operation $n$ times (each time any of the $C_N^m$ possible
combinations of elements of $U$ can be drawn with the same probability).
We use $\mu_r = \mu_r(n, m, N)$ to denote the number of the elements each
of which appeared exactly $r$ times ($r = 1, 2, \ldots, n$). We will estimate
a parametric function $\tau(N)$ using the sample $(\mu_1, \ldots, \mu_n)$.

Prove that in the class of linear statistics $\mathcal{W} = \{l = \sum_{r=1}^{n} l_r \mu_r\}$ an
unbiased estimator for $\tau(N)$ only exists when $\tau(N)$ is a polynomial
in $1/N$ of degree $k \le n - 1$. In this case if $\tau(N) = \sum_{j=0}^{k} a_j/N^j$, then
the statistic

$$l^* = \sum_{r=1}^{n} \mu_r \sum_{j=0}^{k} a_j \frac{(r)_{j+1}}{(n)_{j+1}\,m^{j+1}}$$

is the only unbiased estimator for $\tau(N)$.

Specifically, $\frac{1}{m^2 n(n-1)}\sum_{r=1}^{n} r(r-1)\mu_r$ is the only linear unbiased
estimator for $\tau(N) = 1/N$.

Hint. Represent $\mu_r$ as the sum of indicators, i.e., $\mu_r = \xi_1^{(r)} + \ldots + \xi_N^{(r)}$,
where $\xi_i^{(r)} = 1$ if the $i$th element of $U$ appeared $r$ times,
and $\xi_i^{(r)} = 0$ otherwise ($i = 1, \ldots, N$).

2.37*. (Continued from Problem 2.36.) Let $\eta = \mu_1 + \ldots + \mu_n$ be
the total number of the observed elements, and let $\mathcal{H}$ be a class of
statistics $H(\eta)$.

Prove that (a) if $N \le mn$, then the statistic

$$\tau^* = \sum_{j=m}^{\eta} (-1)^j C_\eta^j (C_j^m)^n \tau(j) \Big/ \sum_{j=0}^{\eta} (-1)^j C_\eta^j (C_j^m)^n$$

is an unbiased estimator for any function $\tau(N)$ in the class $\mathcal{H}$;

(b) if $N$ is a priori any integer ($N \ge m$), then this statistic is an
unbiased estimator for the function $\tau(N)$ under the additional condition
$\tau(N) = f(N)(C_N^m)^{-n}$, where $f(N)$ is a polynomial of a degree not
exceeding $mn$ and satisfying the conditions $f(0) = f(1) = \ldots = f(m - 1) = 0$.

2.38. The Monte Carlo method. When seeking the values of various
quantities (e.g., defined by equations or integrals), we often use a
computational method based on their probabilistic interpretation and the
realizations of random trials. This is the Monte Carlo method or the
method of statistical trials. Depending on the nature of the calculated
quantity $a$, we choose a random variable $\xi$ so that $a = E\xi$. We simulate
a sample $X = (X_1, \ldots, X_n)$ from the distribution $\mathcal{L}(\xi)$ and use the
sample mean $\bar{X}$ to estimate $a$. Then (see Problem 1.61) we have

$$P(\sqrt{n}\,|\bar{X} - a|/S' \le c_\gamma) \to 2\Phi(c_\gamma) - 1 = \gamma$$

as $n \to \infty$. Thus, when $n$ is large, the error in defining $a$ by this method
does not exceed $c_\gamma S'/\sqrt{n}$ with probability $\gamma$.

Suppose that we have to compute the value of the integral

$$a = \int \ldots \int_{U_r} f(t_1, \ldots, t_r)\,dt_1 \ldots dt_r,$$

where $U_r = \{(t_1, \ldots, t_r): 0 \le t_i \le 1,\ i = 1, \ldots, r\}$. We obviously can
take $\xi = f(\eta_1, \ldots, \eta_r)$, where $\eta_1, \ldots, \eta_r$ are independent random
variables uniformly distributed on the segment $[0, 1]$, and simulate a
sample $X$ using the sequence (1.5).

Using the Monte Carlo method, estimate the integral $a = \int_0^1 e^x\,dx$,
given 100 numbers from the sequence (1.5), and compare the resultant
value of $a^*$ with the exact value of $a$. For what $\delta$ will the relation
$P(|a - a^*| \le \delta) = 0.99$ hold?
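The computation for the one-dimensional integral can be sketched as follows. The sketch assumes a library pseudorandom generator instead of the sequence (1.5), so its numbers will differ from those obtained with the book's sequence; the function name `monte_carlo` and the seed are hypothetical.

```python
import math
import random
from statistics import NormalDist, fmean, stdev

def monte_carlo(f, n, gamma, rng):
    """Estimate a = E f(eta), eta uniform on [0, 1], with an error bound.

    Returns (a_star, delta), where delta = c_gamma * S' / sqrt(n) and
    c_gamma solves 2 * Phi(c_gamma) - 1 = gamma.
    """
    values = [f(rng.random()) for _ in range(n)]
    c_gamma = NormalDist().inv_cdf((1.0 + gamma) / 2.0)
    # stdev uses the divisor n - 1, i.e. it returns S'
    return fmean(values), c_gamma * stdev(values) / math.sqrt(n)

a_star, delta = monte_carlo(math.exp, 100, 0.99, random.Random(6))
print(a_star, delta, math.e - 1.0)   # estimate, error bound, exact value
```

The bound `delta` relies on the normal approximation, so it is itself only approximate for $n = 100$.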

2.39. According to the Monte Carlo method, compute the value of
the integral

$$p(r, \sigma_1, \sigma_2) = \frac{1}{2\pi\sigma_1\sigma_2} \iint_{x_1^2 + x_2^2 \le r^2} \exp\Big\{-\frac{1}{2}\Big(\frac{x_1^2}{\sigma_1^2} + \frac{x_2^2}{\sigma_2^2}\Big)\Big\}\,dx_1\,dx_2$$

for $r = 3$, $\sigma_1 = 1$, $\sigma_2 = 2$, simulating a respective sample of size
$n = 100$.

Hint. If $\xi_1$, $\xi_2$ are independent random variables and
$\mathcal{L}(\xi_j) = \mathcal{N}(0, \sigma_j^2)$, $j = 1, 2$, then

$$p(r, \sigma_1, \sigma_2) = P(\xi_1^2 + \xi_2^2 \le r^2).$$

Now use Problem 1.61.
2.40*. Random walk. A particle starts at the moment $t = 0$ from
the point $k$, $0 < k < N$, and wanders through the integer points of
the interval $[0, N]$. If at the moment $t$ the particle was at the point
$l$, then at the moment $t + 1$ it will reach the point $l + 1$ with the
probability $p$ or the point $l - 1$ with the probability $q = 1 - p$,
$1 \le l \le N - 1$. The walk is stopped at the points 0 or $N$, where the
particle is annihilated [5, Chap. XIV].

(1) Find the probability $\pi_{kN}$ that the particle will be annihilated at
the point $N$.

(2) Compute $m_k = E\tau_k$, where $\tau_k$ is the time before the annihilation.

(3) Simulate 100 realizations of the random walk just described for
$N = 7$, $k = 3$, $p = 0.6$, $p = 0.5$ and find the estimates for $\pi_{kN}$ and $m_k$.

Hints. (1) Write the finite-difference equation

$$f(k) = p f(k + 1) + q f(k - 1), \quad k = 1, \ldots, N - 1, \quad f(0) = 0, \quad f(N) = 1,$$

for $f(k) = \pi_{kN}$ and make sure that $\pi_{kN} = (1 - \lambda^k)/(1 - \lambda^N)$,
$\lambda = q/p$, is its only solution if $p \ne q$, and $\pi_{kN} = k/N$ is its only
solution if $p = q = 1/2$.

(2) Write the finite-difference equation

$$m_k = p m_{k+1} + q m_{k-1} + 1, \quad k = 1, \ldots, N - 1, \quad m_0 = m_N = 0,$$

for $m_k$ and make sure that $m_k = \frac{k}{q - p} - \frac{N}{q - p}\pi_{kN}$ is its only
solution if $p \ne q$, and $m_k = k(N - k)$ is its only solution if
$p = q = 1/2$.

(3) Act as in Problem 1.4.
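The closed-form answers from the hints can be compared with a direct simulation. The sketch below uses illustrative parameters ($N = 10$, $k = 3$, 2000 realizations) that are assumptions of this example rather than the values of the problem, and a library pseudorandom generator; the function names are hypothetical.

```python
import random

def hit_probability(k, n_points, p):
    """pi_kN: probability of annihilation at N when starting from k."""
    q = 1.0 - p
    if p == q:
        return k / n_points
    lam = q / p
    return (1.0 - lam**k) / (1.0 - lam**n_points)

def mean_time(k, n_points, p):
    """m_k: expected time before annihilation."""
    q = 1.0 - p
    if p == q:
        return k * (n_points - k)
    return (k - n_points * hit_probability(k, n_points, p)) / (q - p)

def simulate(k, n_points, p, runs, rng):
    """Return (share of walks stopped at N, average stopping time)."""
    hits, total_time = 0, 0
    for _ in range(runs):
        pos, t = k, 0
        while 0 < pos < n_points:
            pos += 1 if rng.random() < p else -1
            t += 1
        hits += pos == n_points
        total_time += t
    return hits / runs, total_time / runs

pi_hat, m_hat = simulate(3, 10, 0.6, 2000, random.Random(7))
print(pi_hat, hit_probability(3, 10, 0.6))
print(m_hat, mean_time(3, 10, 0.6))
```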
Optimum Estimators

2.41*. Prove that an optimum unbiased estimator is always a symmetric
function of observations.

Hint. If $T = T(X)$ is an unbiased estimator for $\tau(\theta)$, consider a
symmetric statistic $T^* = \frac{1}{n!}\sum_{\pi} T(\pi X)$, where
$\pi = \begin{pmatrix} 1 & \ldots & n \\ i_1 & \ldots & i_n \end{pmatrix}$
is a permutation of $n$ elements, $\pi X = (X_{i_1}, \ldots, X_{i_n})$, and summation
is performed over all the $n!$ permutations. Show that
$D_\theta T^* \le D_\theta T$.

2.42. Prove the following properties of optimum estimators. If
$T^* = T^*(X)$ is an optimum unbiased estimator for a function
$\tau = \tau(\theta)$, then (1) the equality $\mathrm{cov}_\theta(T^*, \psi) = 0$ $\forall\theta$ holds for any
statistic $\psi = \psi(X)$ with $E_\theta \psi = 0$ $\forall\theta \in \Theta$; (2) for any other unbiased
estimator $T = T(X)$, $\mathrm{cov}_\theta(T^*, T) = D_\theta T^*$.

Hint. In the first case consider unbiased estimators of the form
$T_\lambda = T^* + \lambda\psi$, $\lambda \in R^1$. In the second case put $\psi = T^* - T$.

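2.43. Check whether the amount of information $i(\theta)$ for the respective
model has the form given in the following table:

Model                              $i(\theta)$
$\mathcal{N}(\theta, \sigma^2)$    $1/\sigma^2$
$\mathcal{N}(\mu, \theta^2)$       $2/\theta^2$
$\Gamma(\theta, \lambda)$          $\lambda/\theta^2$
$C(\theta)$                        $1/2$
$Bi(k, \theta)$                    $k/[\theta(1 - \theta)]$
$\Pi(\theta)$                      $1/\theta$
$\overline{Bi}(r, \theta)$         $r/[\theta(1 - \theta)^2]$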



2.44. Prove that the information matrix for the general normal
model $\mathcal{N}(\theta_1, \theta_2^2)$ has the form

$$I(\theta) = \begin{pmatrix} 1/\theta_2^2 & 0 \\ 0 & 2/\theta_2^2 \end{pmatrix}.$$



2.45*. Show that the information matrix for the model in Problem
2.29 has the form $I(\theta) = \|i_{ij}(\theta)\|_1^{N-1}$, where $\theta = (p_1, \ldots, p_{N-1})$,

$$i_{ij}(\theta) = \begin{cases} 1/p_i + 1/p_N & \text{for } i = j, \\ 1/p_N & \text{for } i \ne j, \end{cases} \qquad p_N = 1 - p_1 - \ldots - p_{N-1}.$$

Calculate $I^{-1}(\theta)$.

Hint. Write the probabilities $f(a_r; \theta) = P_\theta(\xi = a_r) = p_r$,
$r = 1, \ldots, N$, in the form

$$f(a_r; \theta) = \prod_{j=1}^{N} p_j^{\delta(a_r, a_j)} = (1 - p_1 - \ldots - p_{N-1})^{\delta(a_r, a_N)} \prod_{j=1}^{N-1} p_j^{\delta(a_r, a_j)},$$

where $\delta(a_i, a_j) = 1$ for $i = j$ and $\delta(a_i, a_j) = 0$ for $i \ne j$.
2.46*. The model $\mathcal{F} = \{f(x; \theta), \theta \in \Theta\}$ is said to be exponential if
the function $f(x; \theta)$ has the form

$$f(x; \theta) = \exp\{A(\theta)B(x) + C(\theta) + D(x)\}.$$

Prove that an efficient estimator $\tau^*$ for a parametric function $\tau(\theta)$
exists if and only if the model $\mathcal{F}$ is exponential and

if $\theta$ is a scalar, and

jo I /-I 

if $\theta = (\theta_1, \ldots, \theta_r)$.

Derive the formulas 



D«T* = 



aA'iO) 
if $\theta$ is a scalar, and



w .ij]M/«a 



j=i 



7 »j 



if $\theta = (\theta_1, \ldots, \theta_r)$.

Hint. Use the efficiency test.
2.47. Prove that for an exponential model with a scalar parameter
the information function is

$$i(\theta) = (C'(\theta)A''(\theta) - C''(\theta)A'(\theta))/A'(\theta)$$

and

$$E_\theta B(\xi) = -C'(\theta)/A'(\theta).$$

Hint. Compare the expression for $D_\theta \tau^*$ in the previous problem
with the Cramér-Rao bound.
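2.48. Show that in the listed regular models the function $\tau(\theta)$ has
an efficient estimator $\tau^*$ and variance $D_\theta \tau^*$ as given in the following
table:

Model                              $\tau(\theta)$           $\tau^*$                                     $D_\theta \tau^*$
$\mathcal{N}(\theta, \sigma^2)$    $\theta$                 $\bar{X}$                                    $\sigma^2/n$
$\mathcal{N}(\mu, \theta^2)$       $\theta^2$               $\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$     $2\theta^4/n$
$\Gamma(\theta, \lambda)$          $\theta$                 $\bar{X}/\lambda$                            $\theta^2/(\lambda n)$
$\mathcal{B}(\theta, 1)$           $1/\theta$               $-\frac{1}{n}\sum_{i=1}^{n}\ln X_i$          $1/(n\theta^2)$
$Bi(k, \theta)$                    $\theta$                 $\bar{X}/k$                                  $\theta(1 - \theta)/(kn)$
$\Pi(\theta)$                      $\theta$                 $\bar{X}$                                    $\theta/n$
$\overline{Bi}(r, \theta)$         $r\theta/(1 - \theta)$   $\bar{X}$                                    $r\theta/[n(1 - \theta)^2]$

Hint. Use Problem 2.46.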
2.49. Show that the sample mean $\bar{X}$ in a logistic model (see Problem
2.27) is not an efficient estimator for $\theta$.
Hint. Use the result of Problem 2.27.

2.50. Prove that the estimator $T^*$ in Problem 2.13 is optimal.
Hint. Consider the linear combinations of the form

$$\frac{1}{L}\Big(a(\theta)\frac{\partial L}{\partial\theta} + b(\theta)\frac{\partial^2 L}{\partial\theta^2}\Big)$$

and use the Bhattacharyya test.

2.51. Given a sample $X = (X_1, \ldots, X_n)$, estimate the function
$\tau(\theta) = \theta^2$ in the model $\Gamma(\theta, \lambda)$. Prove that
$T^* = T^*(X) = \frac{n}{\lambda(\lambda n + 1)}\bar{X}^2$ is an optimum unbiased estimator
for $\tau(\theta)$. Compute $D_\theta T^*$ and make sure that this estimator is not efficient.

Hint. Use Problems 2.21, 2.43, and the hint to Problem 2.50.

2.52. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$\mathcal{N}(\theta_1, \theta_2^2)$. Apply the Bhattacharyya test and prove that $\bar{X}$ and $S'^2$
(see Problem 2.1 (d)) are optimum unbiased estimators for $\theta_1$ and $\theta_2^2$,
respectively. Compare the variances of these estimators with the
respective Cramér-Rao bounds.

Hint. In the first case it is sufficient to consider $\frac{1}{L}\frac{\partial L}{\partial\theta_1}$, in the
second the linear combinations of the form
$\frac{1}{L}\Big(a(\theta)\frac{\partial L}{\partial\theta_2} + b(\theta)\frac{\partial^2 L}{\partial\theta_1^2}\Big)$. Use Problem 2.44.



2.53*. Let $\bar{X}_1$, $S_1'^2$ and $\bar{X}_2$, $S_2'^2$ be optimum unbiased estimators for
the mean and variance of the same normal distribution computed
from two independent samples of sizes $n_1$ and $n_2$, respectively. What
functions of these statistics are the best estimators for these parameters
and take into account all the original information? Compare the new
estimators with the original ones. Which are more exact?
Hint. Use Problems 2.52 and 2.14.

2.54*. Let $X = (X_1, \ldots, X_n)$ be a sample from an inverse Gauss
distribution defined by the density

$$f(x; \mu, \lambda) = \sqrt{\frac{\lambda}{2\pi x^3}} \exp\Big\{-\frac{\lambda(x - \mu)^2}{2\mu^2 x}\Big\}, \quad x > 0, \quad \lambda > 0, \quad \mu > 0.$$

(1) Show that $\bar{X}$ is an optimum unbiased estimator for the parameter
$\mu$ whether the parameter $\lambda$ is known or not. Derive the relations
$E X_1 = \mu$, $D X_1 = \mu^3/\lambda$.
(2) Find an efficient estimator for $\lambda^{-1}$ when $\mu$ is known.
Hint. Use Problem 2.46 and the Bhattacharyya test.

2.55. Suppose that we are estimating a differentiable vector function
$\tau(\theta) = (\tau_1(\theta), \ldots, \tau_m(\theta))$, $\theta = (\theta_1, \ldots, \theta_r)$. Show that for a regular
model its arbitrary unbiased estimator $T = (T_1(X), \ldots, T_m(X))$ obeys
the information inequality

$$D_\theta(T) = \|\mathrm{cov}_\theta(T_i, T_j)\|_1^m \ge B(\theta) I^{-1}(\theta) B'(\theta),$$

where $B(\theta) = \Big\|\frac{\partial\tau_i(\theta)}{\partial\theta_j}\Big\|$, and the inequality $A_1 \ge A_2$ between the
matrices having the same dimensionality implies that the matrix $A_1 - A_2$
is nonnegative definite. Specifically, for $\tau(\theta) = \theta$ we have
$D_\theta(T) \ge I^{-1}(\theta)$.

Hint. Consider an arbitrary linear combination $c_1\tau_1(\theta) + \ldots + c_m\tau_m(\theta) = c'\tau(\theta)$
whose unbiased estimator is $c'T$ and apply
the Cramér-Rao inequality for scalar estimators.

2.56. Show that if an efficient estimator exists for a function $\tau(\theta)$, then it is a sufficient statistic. Thus, a sufficient statistic always exists for regular exponential models (see Problem 2.46) and has the form $T(X) = \sum_{i=1}^{n} B(X_i)$ (which also follows from the factorization test).

2.57. Prove the completeness of the sufficient statistic $T_n = \sum_{i=1}^{n} X_i$ for a binomial model $Bi(k, \theta)$ (see Problem 2.48). Show that in this case unbiased estimators only exist for the polynomials $\tau(\theta) = \sum_{j=0}^{r} a_j\theta^j$ of degree $r \le kn$, and then the optimum estimator is

$$\tau^* = \sum_{j=0}^{r} a_j \frac{(T_n)_j}{(kn)_j}, \qquad (a)_j = a(a - 1)\cdots(a - j + 1),\ (a)_0 = 1.$$

Compare this result with Problems 2.5, 2.7, and 2.8.

Hint. Use the reproducibility of the distribution $Bi(k, \theta)$ (see Problem 1.39 (3)).

2.58. Prove the completeness of the sufficient statistic $T_n = \sum_{i=1}^{n} X_i$ for Poisson's model $\Pi(\theta)$ (see Problem 2.48). Show that the statistic $\tau^* = \sum_j a_j (T_n)_j / n^j$, where $(T_n)_j = T_n(T_n - 1)\cdots(T_n - j + 1)$, is an optimum estimator for the power series $\tau(\theta) = \sum_j a_j\theta^j$ which is convergent for all $\theta > 0$.
Hint. Use the reproducibility of $\Pi(\theta)$ (see Problem 1.39 (4) and Problem 2.9).
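The unbiasedness behind Problem 2.58 can be checked numerically. The sketch below is only an illustration, not part of the book: it assumes that $T_n$ has the distribution $\Pi(n\theta)$ (the reproducibility property) and sums the series for $E_\theta (T_n)_j/n^j$ up to a hypothetical cut-off `t_max`; the names `falling`, `poisson_pmf`, and `expected_estimator` are introduced here for the example.

```python
import math


def falling(t, j):
    """Falling factorial (t)_j = t(t-1)...(t-j+1), with (t)_0 = 1."""
    out = 1
    for i in range(j):
        out *= t - i
    return out


def poisson_pmf(t, mean):
    """P(T = t) for a Poisson variable with the given mean (computed in log space)."""
    return math.exp(-mean + t * math.log(mean) - math.lgamma(t + 1))


def expected_estimator(j, n, theta, t_max=400):
    """Truncated series for E[(T_n)_j / n^j], assuming T_n ~ Poisson(n * theta)."""
    mean = n * theta
    return sum(falling(t, j) / n**j * poisson_pmf(t, mean) for t in range(t_max + 1))


if __name__ == "__main__":
    n, theta = 5, 1.3  # illustrative values, not taken from the book
    for j in range(4):
        print(j, expected_estimator(j, n, theta), theta**j)
```

For each $j$ the two printed numbers should agree up to rounding, which is the statement $E_\theta (T_n)_j/n^j = \theta^j$ used term by term in the power series.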
2.59. (Continued from Problem 2.58.) Construct optimum estimators for the functions

t(0) = e ,fe -'\ w*m = e-"0Vfr!, * = 0, 1 

and 

T>(fl> = P«(£> r). r = 1, 2 

2.60*. Let $X = (X_1, \ldots, X_n)$ be a sample from a power series distribution defined by the probabilities

$$f(x; \theta) = a(x)\theta^x / f(\theta), \quad x = l, l + 1, \ldots, \qquad f(\theta) = \sum_{x=l}^{\infty} a(x)\theta^x, \quad \theta \in \Theta,$$

where $\Theta = (0, R)$ and $R > 0$ is the convergence radius of the series.

(1) Show that in this model an efficient estimator only exists for the function $\tau(\theta) = \theta f'(\theta)/f(\theta)$ and has the form $\tau^* = \bar{X}$.

(2) Prove that $T_n = \sum_{i=1}^{n} X_i$ is a complete sufficient statistic with the distribution

$$P_\theta(T_n = t) = b_n(t)\theta^t / f^n(\theta), \quad t \ge nl,$$

where $b_n(t) = \mathrm{coef}_{z^t} f^n(z)$.

(3) Make sure that the statistic

$$\tau_s^* = b_n(T_n - s)\, b_n^{-1}(T_n)\, I(T_n \ge nl + s)$$

is an optimum estimator for the function $\tau_s(\theta) = \theta^s$ for any $s = 1, 2, \ldots$, and

$$\tau^*(s) = b_{n-1}(T_n - s)\, b_n^{-1}(T_n)\, I(T_n \ge (n - 1)l + s)$$

is an optimum estimator for $\tau(s; \theta) = \theta^s / f(\theta)$.

(4) Construct an optimum estimator for the function $\tau(\theta) = \sum_{j=r}^{\infty} a_j\theta^j$, where the power series is convergent on $\Theta$. Show that $f^* = b_{n+1}(T_n)\, b_n^{-1}(T_n)\, I(T_n \ge (n + 1)l)$ is an optimum estimator for the function $f(\theta)$.

Hints. (1) Apply the efficiency test for an exponential model (see Problem 2.46).
(2) Use the generating function $\varphi(z; \theta) = \sum_x z^x f(x; \theta) = f(z\theta)/f(\theta)$.
(3) Take into account the relation $\sum_{j=l}^{k-nl} a(j)\, b_n(k - j) = b_{n+1}(k)$ for $k \ge (n + 1)l$.

2.61. Given a sample $X = (X_1, \ldots, X_n)$ from Poisson's distribution truncated at zero (see Problem 2.10), show that an optimum estimator for $\tau(\theta) = \theta$ is the statistic $\tau^* = T\Delta^n 0^{T-1}/\Delta^n 0^T$ for $T = X_1 + \ldots + X_n \ge n + 1$, and the statistic $\tau^* = 0$ for $T = n$, where $\Delta^n 0^k = \sum_{r=0}^{n} (-1)^{n-r}\binom{n}{r} r^k$.

Hint. Use Problem 2.60.

2.62. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution $\overline{Bi}(r, \theta)$, construct optimum estimators for $\tau_1(\theta) = \theta^s$ for integer $s \ge 1$, $\tau_2(\theta) = P_\theta(\xi = 0) = (1 - \theta)^r$, and $\tau_3(\theta) = \theta^s(1 - \theta)^j$, $s \ge 0$, $j \le rn$.

Hint. Use the solution to Problem 2.60 and the hint to Problem 2.11.
2.63*. Consider a model with a finite number $N$ of possible outcomes and unknown probabilities $p_1, \ldots, p_N$ of the outcomes (see Problem 2.29). Show that $T = (\nu_1, \ldots, \nu_{N-1})$ is a minimal complete sufficient statistic. Prove that unbiased estimators only exist in this model for the polynomials in $p_1, \ldots, p_N$ of a degree smaller than or equal to $n$ and find the explicit form of these estimators.

Hint. Use the test for an $r$-parametric exponential family and Problems 2.29, 2.45, and 1.52 (b).

2.64. Prove that the estimators in Problems 2.13 and 2.16 are optimal.

Hint. Use the property of complete sufficient statistics.

2.65. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\mathcal{L}(\xi) = \mathcal{N}(\theta, \sigma^2)$.

(1) Construct an optimum estimator for $\tau(\theta) = P_\theta(\xi \le x_0)$, where $x_0$ is a given number.

(2) Show that an optimum estimator for the theoretical density $f_\sigma(x; \theta)$ has the form $f^*(x) = f_{\sigma_1}(x; \bar{X})$, where $\sigma_1^2 = (n - 1)\sigma^2/n$. Specifically, it follows that if $\tau(\theta; \sigma^2) = E_\theta\varphi(\xi)$, then its optimum estimator is $\tau^* = \int_{-\infty}^{\infty} \varphi(x) f^*(x)\,dx = \tau(\bar{X}; \sigma_1^2)$. Apply this result to the
functions $\varphi_1(x) = e^{tx}$, $\varphi_2(x) = x^2$, $\varphi_3(x) = I(x \le x_0)$.
Hints. (1) Consider the unbiased estimator $T_1 = I(X_1 \le x_0)$ and use the Rao-Blackwell-Kolmogorov theorem (see the solution to Problems 2.64 and 1.56).

(2) Check whether $f^*(x)$ satisfies the unbiasedness equation.
2.66. Using the test for an $r$-parametric exponential family, verify that the pairs $(\bar{X}, \sum_{i=1}^{n} X_i^2)$ and $(\bar{X}, S^2)$ are minimal complete sufficient statistics for the model $\mathcal{N}(\theta_1, \theta_2^2)$. Prove the optimality of the estimators in Problem 2.20 (compare with Problem 2.52).

Hint. Consider the new parameters $\theta_1' = \theta_1/\theta_2^2$, $\theta_2' = -1/(2\theta_2^2)$.

2.67. Show that the pair $T = (\bar{X}, S^2)$ is a sufficient statistic for the model $\mathcal{N}(\theta, \gamma^2\theta^2)$ but this statistic is incomplete.

Hint. Consider the function $\varphi(T) = (n + \gamma^2)S^2((n - 1)\gamma^2)^{-1} - \bar{X}^2$ and calculate its mean.

2.68. Given $n \ge 2$ independent measurements of the diameter $\theta_1$ of a circle, construct an optimum unbiased estimate for its area.

Hint. Assume that the measurement errors are $\mathcal{N}(0, \theta_2^2)$-normal random variables and use Problem 2.66.

2.69*. Prove the following assertion (Basu theorem): if a complete sufficient statistic $T$ exists for the model $\mathcal{F} = \{F(x; \theta),\ \theta \in \Theta\}$ and the statistic $T_1$ has a distribution independent of the parameter $\theta$, then $T_1$ and $T$ are independent.

Hint. Show that for any event $A$ the conditional $P_\theta(T_1 \in A \mid T)$ and unconditional $P_\theta(T_1 \in A)$ probabilities coincide.
2.70*. Let $(X_1, X_2, X_3)$ be a sample from the distribution $\mathcal{L}(\xi) = \mathcal{N}(0, \theta^2)$. Construct an optimum estimator for $\tau(\theta) = P_\theta(\xi \le x_0)$.

Hint. See the hint to Problems 2.65, 2.69, and the solution to Problem 1.58.
2.71*. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\mathcal{N}(\theta_1, \theta_2^2)$. Prove that the statistics $T = (\bar{X}, S^2)$ and

$$U = \left(\frac{X_1 - \bar{X}}{S}, \ldots, \frac{X_n - \bar{X}}{S}\right)$$

are independent.

Hint. Show that the distribution of $U$ is independent of $\theta = (\theta_1, \theta_2)$ and apply the Basu theorem (see Problem 2.69).
2.72*. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution $\mathcal{N}(\theta_1, \theta_2^2)$, construct an optimum unbiased estimator for the function

$$\tau(\theta) = P_\theta(\xi \le x_0) = \Phi\left(\frac{x_0 - \theta_1}{\theta_2}\right).$$

Hint. Consider an unbiased estimator $T_1 = I(X_1 \le x_0)$ and calculate $H(T) = E_\theta(T_1 \mid T)$, where $T = (\bar{X}, S^2)$. Use Problems 2.71 and 1.58.

2.73. Make sure that the estimators in Problem 2.21 are optimal. Show that unbiased estimators for $\tau_a(\theta) = \theta^{-a}$ do not exist when $a - \lambda n \ge 0$ is an integer.

Hint. Use the completeness of a sufficient statistic $T$.

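2.74*. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\Gamma(\theta, \lambda)$, $T = X_1 + \ldots + X_n$, and $\varphi(x)$ be a given function for which $\tau(\theta) = E_\theta\varphi(\xi)$ exists. Prove that an optimum estimator for $\tau(\theta)$ has the form

$$\tau^* = \frac{\Gamma(\lambda n)}{\Gamma(\lambda)\Gamma(\lambda(n - 1))}\int_0^1 \varphi(xT)\, x^{\lambda - 1}(1 - x)^{\lambda(n-1) - 1}\,dx.$$

Derive the results of Problems 2.48, 2.51, and 2.73.
Hint. Prove that $E_\theta\tau^* = \tau(\theta)$.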
2.75*. (Continued from Problem 2.74.) Check whether the statistic

$$\tau^* = [1 - B(t/T; \lambda, \lambda(n - 1))]\, I(T \ge t)$$

is an optimum estimator for the reliability function $\tau(\theta; t) = P_\theta(\xi \ge t)$, where $B(x; a, b)$ is the distribution function of a beta distribution $\beta(a, b)$. Specifically, for the distribution $\Gamma(\theta, 1)$ we have $\tau(\theta; t) = e^{-t/\theta}$ and $\tau^* = (1 - t/T)^{n-1} I(T \ge t)$.

Hint. Use Problem 2.74 assuming that $\varphi(x) = I(x \ge t)$.

2.76*. Prove that $T = T(X) = \sum_{i=1}^{n} X_i^\lambda$ is a complete sufficient statistic for Weibull's distribution $W(0, \lambda, \theta)$ with an unknown scale parameter $\theta$, and the optimum estimator for $\tau(\theta) = E_\theta\varphi(\xi)$, where $\varphi(x)$ is a given function, has the form

$$\tau^* = (n - 1)\int_0^1 \varphi\big((tT)^{1/\lambda}\big)(1 - t)^{n-2}\,dt.$$

Specifically, we have $E_\theta\xi^\lambda = \theta^\lambda$, and therefore $T/n$ is an optimum estimator for $\theta^\lambda$.

2.77. Show that the pair $T = (X_{(1)}, \bar{X})$ is a sufficient statistic for the two-parameter exponential distribution $W(\theta_1, 1, \theta_2)$. Construct unbiased estimators of the form $\alpha X_{(1)} + \beta\bar{X}$ for the unknown parameters of the model and compute their variances.

Remark. Since $T$ is a complete statistic, the respective estimators are optimal.
Hint. Apply the factorization test and use the solution to Problem 1.34 taking into account that $\mathcal{L}\left(\dfrac{\xi - \theta_1}{\theta_2}\right) = \Gamma(1, 1)$.



2.78. Let the observable random variable $\xi$ have a range $[a(\theta), b]$, where $a(\theta)$ is a given monotone function of $\theta$. Show that the minimal value of the sample $X_{(1)}$ is a sufficient statistic for $\theta$ if and only if the distribution density $f(x; \theta)$ has the form $f(x; \theta) = g(x)/h(\theta)$, $a(\theta) \le x \le b$. This result is also true for the statistic $X_{(n)}$ if the range is $[a, b(\theta)]$, where $b(\theta)$ is a given monotone function of $\theta$.
Hint. Apply the factorization test.

2.79. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $R(0, \theta)$. Prove that $X_{(n)} = \max_i X_i$ is a complete sufficient statistic for $\theta$. Prove that $T^* = \dfrac{n + 1}{n} X_{(n)}$ is an optimum estimator for $\theta$ and, generally, $\tau^* = \tau(X_{(n)}) + \dfrac{1}{n} X_{(n)}\tau'(X_{(n)})$ is an optimum estimator for an arbitrary differentiable function $\tau(\theta)$. Consider the class of statistics $T_\lambda = \lambda X_{(n)}$ and show that it contains estimates with smaller mean-square error than that of $T^*$.

Hint. Use Problem 2.24.

2.80. Prove the completeness of a sufficient statistic $T = (X_{(1)}, X_{(n)})$ for the model $R(\theta_1, \theta_2)$. Make sure that the estimators in Problem 2.25 are optimal. Construct optimum estimators for $\theta_1$ and $\theta_2$.

Hint. Use Problem 1.36.

2.81*. Show that $T = (X_{(1)}, X_{(n)})$ is a sufficient statistic for the model $R(a(\theta), b(\theta))$, where $a(\theta) < b(\theta)\ \forall\theta$ are given continuous functions of the scalar parameter $\theta$. Find the conditions for a univariate sufficient statistic to exist and establish its form. Verify that $\max(|X_{(1)}|, |X_{(n)}|)$ is a sufficient statistic for the model $R(-\theta, \theta)$, and $T$ is a minimal sufficient statistic for the models $R(\theta, \theta + 1)$ and $R(\theta, 2\theta)$.

2.82*. Suppose that we have made one observation $X$ on a discrete random variable distributed as

$$f(x; \theta) = \begin{cases}\theta & \text{for } x = -1,\\ \theta^x(1 - \theta)^2 & \text{for } x = 0, 1, 2, \ldots\end{cases}$$

Show that $X$ is a boundedly complete sufficient statistic.

Hint. Solve the unbiasedness equation $E_\theta\varphi(X) = 0\ \forall\theta$ in the class of all functions and the subclass of bounded functions.

2.83*. Estimation of the size of a finite population. Let $m = 1$ in Problems 2.36 and 2.37. Prove that the random variable $\eta$ is a complete sufficient statistic and, consequently, the estimators in Problem 2.37 are optimal.

Remark. This result is true for arbitrary $m$.



Maximum Likelihood Estimates 

2.84*. Show that if an efficient estimator $\tau^*$ exists for a differentiable parametric function $\tau(\theta)$ in a regular model, then the m.l.e. $\hat\theta_n$ for the parameter $\theta$ is uniquely defined by the equation $\tau(\theta) = \tau^*$. Apply this result to find $\hat\theta_n$ for the models of Problem 2.48.

Hint. Use the efficiency test and show that $\left.\dfrac{\partial^2 \ln L}{\partial\theta^2}\right|_{\theta = \hat\theta_n} < 0$ (assume that the likelihood function $L = L(x; \theta)$ is twice differentiable with respect to $\theta$).

2.85. Compute the asymptotic efficiency of the sample median $T_n = X_{([n/2]+1)}$, which is an estimator for the mean $\theta$ in the model $\mathcal{N}(\theta, \sigma^2)$.

Hint. Use Problem 1.32 on the asymptotic normality of sample quantiles.

2.86. Prove that in the general normal model $\mathcal{N}(\theta_1, \theta_2^2)$ the m.l.e. $\hat\theta_n = (\hat\theta_{1n}, \hat\theta_{2n}) = (\bar{X}, S)$.

Hint. Write the likelihood equations and solve them.

2.87. (Continued from Problem 2.86.) Show that $\hat\tau_n = \Phi\left(\dfrac{x_0 - \bar{X}}{S}\right)$ is a m.l.e. for the function $\tau(\theta) = \Phi\left(\dfrac{x_0 - \theta_1}{\theta_2}\right)$ (see Problem 2.72). Find an asymptotic distribution for $\hat\tau_n$ as $n \to \infty$.

Hint. Use the invariance of m.l.e.'s and the assertion that they are asymptotically normal.

2.88. Prove that the m.l.e. $\hat\theta_n$ for the parameter $\theta$ in the model $\mathcal{N}(\mu, \theta^2)$ is asymptotically unbiased and consistent (compare with Problem 2.16). Investigate its limiting distribution as $n \to \infty$. Calculate the asymptotic efficiency of the m.l.e. in Problem 2.15.

Hint. Use Problems 2.43 and 2.84.

2.89. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\mathcal{N}(\theta, 2\theta)$. Find the m.l.e. $\hat\theta_n$ and prove its consistency.

2.90. Given a sample $((X_1, Y_1), \ldots, (X_n, Y_n))$ from the bivariate normal distribution $\mathcal{N}\left((0, 0),\ \sigma^2\begin{Vmatrix}1 & \rho\\ \rho & 1\end{Vmatrix}\right)$ with unknown $\sigma^2 > 0$ and $\rho \in (-1, 1)$, construct the m.l.e.'s $\hat\sigma^2$ and $\hat\rho$.

Hint. Go over to new parameters $q = (q_1, q_2)$ putting $q_1 = q_1(\theta) = \dfrac{1}{2\sigma^2(1 - \rho^2)}$, $q_2 = q_2(\theta) = \dfrac{\rho}{\sigma^2(1 - \rho^2)}$ (here $\theta = (\sigma^2, \rho)$) and use the invariance of the m.l.e.

2.91*. Given a sample $((X_1, Y_1), \ldots, (X_n, Y_n))$ from the bivariate normal distribution $\mathcal{N}\left((0, 0),\ \begin{Vmatrix}1 & \theta\\ \theta & 1\end{Vmatrix}\right)$, $\theta \in (-1, 1)$, write the likelihood equation to find $\hat\theta_n$ and calculate its asymptotic variance.

2.92. (Continued from Problem 2.91.) Assume that the sample correlation coefficient $T_n = \dfrac{1}{n}\sum_{i=1}^{n} X_i Y_i$ is an estimator for $\theta$ and compute its asymptotic efficiency.

Hint. Use the characteristic function to find the moments.
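2.93*. Let $X = (X_1, \ldots, X_n)$ be a sample from the $k$-variate normal distribution $\mathcal{N}(\mu, \Sigma)$ with unknown $\mu = (\mu_1, \ldots, \mu_k)$ and $\Sigma = \|\sigma_{ij}\|$, $|\Sigma| \ne 0$, i.e., $X_i = (X_{i1}, \ldots, X_{ik})$, $i = 1, \ldots, n$, are independent variables with the density

$$f(x; \theta) = (2\pi)^{-k/2}|\Sigma|^{-1/2}\exp\left\{-\tfrac{1}{2}(x - \mu)'\Sigma^{-1}(x - \mu)\right\},$$

$x = (x_1, \ldots, x_k)$, $\theta = (\mu, \Sigma)$. We write $\bar{X} = (\bar{X}_1, \ldots, \bar{X}_k)$, where $\bar{X}_j = \dfrac{1}{n}\sum_{i=1}^{n} X_{ij}$, $j = 1, \ldots, k$, with the sample covariance $S_{ij} = \dfrac{1}{n}\sum_{l=1}^{n}(X_{li} - \bar{X}_i)(X_{lj} - \bar{X}_j)$ which corresponds to the theoretical covariance $\sigma_{ij}$, so that

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i, \qquad \hat\Sigma = \hat\Sigma(X) = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})(X_i - \bar{X})'.$$

(1) Prove that $\bar{X}$ and $\hat\Sigma$ are m.l.e.'s for the parameters $\mu$ and $\Sigma$, respectively.

(2) Make sure that $\dfrac{n}{n - 1}\hat\Sigma$ is an unbiased estimate for $\Sigma$.

(3) Obtain the expression

$$\max_\theta L(x; \theta) = L(x; \bar{x}, \hat\Sigma(x)) = (2\pi e)^{-kn/2}|\hat\Sigma(x)|^{-n/2}$$

for the maximum of the likelihood function.

Hint. Reduce the likelihood function to the form

$$L(x; \theta) = [(2\pi)^k|\Sigma|]^{-n/2}\exp\left\{-\frac{n}{2}\,\mathrm{tr}\big(\Sigma^{-1}\hat\Sigma(x)\big) - \frac{n}{2}(\bar{x} - \mu)'\Sigma^{-1}(\bar{x} - \mu)\right\}$$

and use Problem 2.4.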
2.94*. Let $X = (X_1, \ldots, X_n)$ be a sample from a lognormal distribution, i.e., $X_i = e^{Y_i}$, where $\mathcal{L}(Y_i) = \mathcal{N}(\theta_1, \theta_2^2)$. Construct the m.l.e.'s for the functions $\tau_1(\theta) = E_\theta X_1$ and $\tau_2(\theta) = D_\theta X_1$. Compute $E_\theta\hat\tau_{1n}$ and show that the estimate $\hat\tau_{1n}$ is asymptotically unbiased.
Hint. Use the invariance of the m.l.e.
2.95. Kapteyn's distribution. This distribution is defined by the density

$$f(x; \theta) = \frac{g'(x)}{\sqrt{2\pi}\,\theta_2}\exp\left\{-\frac{(g(x) - \theta_1)^2}{2\theta_2^2}\right\}, \quad \theta = (\theta_1, \theta_2),$$

where $g(x)$ is a differentiable monotone increasing function. Show that the following generalization of the result of Problem 2.86 is true, i.e., the m.l.e. $\hat\theta_n = (\bar{g}, s)$, where $\bar{g} = \dfrac{1}{n}\sum_{i=1}^{n} g(X_i)$, $s^2 = \dfrac{1}{n}\sum_{i=1}^{n}(g(X_i) - \bar{g})^2$.

Is $\bar{g}$ an efficient estimate for $\theta_1$? Show that the statistic

$$\frac{1}{n}\sum_{i=1}^{n}(g(X_i) - a)^2$$

is an efficient estimate for $\theta_2^2$ when $\theta_1 = a$, where $a$ is known (compare with the respective results for the normal model in Problem 2.48).
Hint. Use Problem 2.46.

2.96. Suppose that a random variable $\xi$ has a power series distribution (see Problem 2.60). Show that here the likelihood equation for finding the m.l.e. $\hat\theta_n$ has the form $\mu(\theta) = \bar{X}$, where $\mu(\theta) = E_\theta\xi$. Calculate the asymptotic variance of $\hat\theta_n$. Apply the results to estimate the parameter $\theta$ in the model $\overline{Bi}(r, \theta)$.

2.97. Write the accumulation method equations to compute approximately the m.l.e. $\hat\theta_n$ for the parameter $\theta$ of Poisson's distribution truncated at zero (see Problem 2.10).

Hint. Use the solution to Problem 2.96.

2.98. Suppose that in a polynomial distribution $M(n; p_1, \ldots, p_N)$ the probabilities of the outcomes are $p_i = p_i(\theta)$, $i = 1, \ldots, N$, where $\theta$ is an unknown scalar parameter. Write the accumulation method equations for an approximate calculation of the m.l.e. $\hat\theta_n$.

2.99. Given a sample $X = (X_1, \ldots, X_n)$, estimate the parameter $\theta$ of the Cauchy model $C(\theta)$. Using the accumulation method, write the equations to calculate the m.l.e. $\hat\theta_n$ approximately. Use the sample median $T_n = X_{([n/2]+1)}$ as an estimator for $\theta$ and compute its asymptotic efficiency.

Hint. Use Problems 2.43 and 1.32.

2.100. Let $X = (X_1, \ldots, X_n)$ be a sample from the uniform distribution $R(0, \theta)$. Show that here the m.l.e. $\hat\theta_n = X_{(n)}$, make sure that it is consistent, and find its limiting distribution as $n \to \infty$.

Hint. Use Problems 2.24 and 2.79.

2.101. Show that in the model $R(\theta - 1/2, \theta + 1/2)$ any $\theta \in [X_{(n)} - 1/2,\ X_{(1)} + 1/2]$ is a m.l.e. $\hat\theta_n$. What point of this interval is an unbiased estimate for $\theta$?

Hint. Use the solutions to Problems 2.80 and 1.36.

2.102. Show that for the shift parameter $\theta$ in Weibull's distribution $W(\theta, a, b)$, $0 < a \le 1$, the m.l.e. $\hat\theta_n$ is $X_{(1)}$, prove that it is consistent, and find its limiting distribution as $n \to \infty$.

Hint. Use the solutions to Problems 1.37 and 2.26.

2.103. A random variable $\xi$ which describes the lifetimes of the elements of an electronic device has Rayleigh's distribution $W(0, 2, \sqrt{\theta})$ with the density $f(x; \theta) = (2x/\theta)e^{-x^2/\theta}$, $x \ge 0$. Given a sample $X = (X_1, \ldots, X_n)$, construct the m.l.e. $\hat\theta_n$ (compare with Problem 2.76).

2.104. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution $\Gamma(\theta, \lambda)$, estimate the function $\tau(\theta) = 1/\theta$. Show that the m.l.e. $\hat\tau_n = \lambda/\bar{X}$. Make sure that the estimate is consistent and find its limiting distribution as $n \to \infty$.

Hint. Use Problems 2.21, 2.43, and 2.84.
2.105*. Prove that for the Laplace distribution defined by the density $f(x; \theta) = \frac{1}{2}e^{-|x - \theta|}$, $x \in R^1$, the m.l.e. $\hat\theta_n$ coincides with the sample median. Can we use here the theorem on the asymptotic normality of the m.l.e.?

2.106*. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\mathcal{N}(\theta, 1)$. Then (see Problem 2.84) $\hat\theta_n = \bar{X}$ and $\mathcal{L}(\bar{X}) = \mathcal{N}(\theta, 1/n)$. Take the statistic

$$T_n = \bar{X}\, I(|\bar{X}| \ge a_n) + b\bar{X}\, I(|\bar{X}| < a_n)$$

as an estimator for $\theta$, where $a_n \to 0$, but $\sqrt{n}\,a_n \to \infty$ as $n \to \infty$, and calculate its asymptotic efficiency.

2.107. Give examples of the m.l.e. $\hat\theta_n$ for which $D_\theta\hat\theta_n = o(n^{-1})$.

Hint. Consider the model $R(0, \theta)$ (see Problem 2.100) and Weibull's model (see Problems 2.102 and 1.37).

2.108. Estimate the function $\tau(\theta) = \theta^{-1}$ in the model $\Pi(\theta)$ and make sure that the m.l.e. $\hat\tau_n$ has no finite moments for any $n$ but, nevertheless, its asymptotic variance exists and is equal to $(\theta^3 n)^{-1}$.

Hint. Use Problems 2.84, 1.39, and 2.43.
2.109*. Variance-stabilizing transformations. For the models $Bi(k, \theta)$, $\Pi(\theta)$, $\mathcal{N}(\mu, \theta^2)$, and $\Gamma(\theta, \lambda)$ find the parametric functions $\tau(\theta)$ such that the asymptotic variances of the respective m.l.e.'s $\hat\tau_n$ are independent of the parameter $\theta$.
Hint. Use Problem 2.43.
2.110. Simulate samples of sizes $n = 10$, $n = 100$, $n = 1000$ and obtain the m.l.e.'s for the parameters of the following distributions:

(1) $\mathcal{N}(\theta_1, \theta_2^2)$ with $\theta_1 = 1$, $\theta_2^2 = 4$;

(2) $Bi(1, \theta)$ with $\theta = 0.7$;

(3) $R(0, \theta)$ with $\theta = 1$.

Hint. Use Problems 2.86, 2.84, and 2.100, respectively.
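The simulation asked for in Problem 2.110 might be organized along the following lines. This is a minimal sketch rather than a prescribed solution: it assumes the readings $\theta_2^2 = 4$ (standard deviation 2) and $Bi(1, \theta)$ for the second model, uses Python's built-in pseudorandom generator instead of tables of random numbers, and takes the m.l.e.'s as $(\bar{X}, S)$, $\bar{X}$, and $X_{(n)}$ following the problems named in the hint. The function name `simulate_mles` and the seeds are introduced here for illustration.

```python
import random
import statistics


def simulate_mles(n, seed=0):
    """Simulate one sample of size n per model and return the m.l.e.'s."""
    rng = random.Random(seed)
    # N(theta1, theta2^2) with theta1 = 1 and theta2^2 = 4 (assumed reading): sd = 2.
    normal = [rng.gauss(1.0, 2.0) for _ in range(n)]
    # Bi(1, theta) with theta = 0.7 (assumed reading of the binomial model).
    bernoulli = [1 if rng.random() < 0.7 else 0 for _ in range(n)]
    # R(0, theta) with theta = 1.
    uniform = [rng.random() for _ in range(n)]
    return {
        "normal_mean": statistics.fmean(normal),      # m.l.e. of theta1: sample mean
        "normal_sd": statistics.pstdev(normal),       # m.l.e. of theta2: S with divisor n
        "bernoulli_theta": statistics.fmean(bernoulli),
        "uniform_theta": max(uniform),                # m.l.e. of theta: sample maximum
    }


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, simulate_mles(n, seed=n))
```

Comparing the three rows with the values used in the simulation shows how the estimates settle as $n$ grows; note that the estimate for $R(0, \theta)$ always lies below the true $\theta$.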
2.111*. Estimation of the size of a finite population. Under the conditions of Problem 2.83 show that the m.l.e. $\hat{N}$ for an unknown parameter of the population $N$ can be found in a unique way (for $\eta > 1$) from the condition

$$S(\hat{N}, \eta) \le n < S(\hat{N} - 1, \eta),$$

where $S(N, k) = \ln\dfrac{N + 1}{N + 1 - k}\Big/\ln\dfrac{N + 1}{N}$ for $N \ge k > 1$, $S(k - 1, k) = \infty$; if $\eta = 1$, then $\hat{N} = 1$.

Find the values of $\eta$ for which $\hat{N} = \eta$. Assume that $n, N \to \infty$, $0 < \alpha_0 \le \alpha = \dfrac{n}{N + 1} \le \alpha_1 < \infty$, where $\alpha_0$, $\alpha_1$ are constants, and find an approximate expression for the m.l.e. $\hat\alpha = n/(\hat{N} + 1)$. Generalize the result to the case of an arbitrary $m$.

2.112. (Continued from Problem 2.111.) To estimate the unknown number $N$ of fish in a lake, we carry out the following experiment. At the first stage we catch $m_1$ fish at random and without replacement, ring them, and release them to the lake. Using the same scheme at the second stage, we catch $m_2$ fish and register the number $\mu$ of ringed fish (so that the total number of fish caught during the two stages is $\eta = m_1 + m_2 - \mu$). Show that given $\mu$, the m.l.e. $\hat{N}$ is defined by

$$\hat{N} = \left[\frac{m_1 m_2}{\mu}\right],$$

where $[\cdot]$ is an integer part. Compare this result for $m_1 = m_2$ with that obtained in Problem 2.36.
Hint. Take into account that the statistic $\mu$ has a hypergeometric distribution $H(m_1, N, m_2)$.
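The form of the estimate in Problem 2.112 can be compared with a direct search over the likelihood. The sketch below is an illustration under stated assumptions, not the book's solution: the likelihood is taken to be the hypergeometric probability of observing $\mu$ ringed fish among $m_2$ caught, the search stops at a hypothetical upper bound `n_max`, ties are resolved in favour of the largest maximizer, and the counts in the usage lines are made up.

```python
from fractions import Fraction
from math import comb


def likelihood(N, m1, m2, mu):
    """P(mu ringed fish in the second catch) for a lake of N fish, as an exact fraction."""
    return Fraction(comb(m1, mu) * comb(N - m1, m2 - mu), comb(N, m2))


def mle_brute_force(m1, m2, mu, n_max=2000):
    """Largest N in [m1 + m2 - mu, n_max] that maximizes the likelihood (requires mu >= 1)."""
    n_min = m1 + m2 - mu
    values = {N: likelihood(N, m1, m2, mu) for N in range(n_min, n_max + 1)}
    best = max(values.values())
    return max(N for N, v in values.items() if v == best)


if __name__ == "__main__":
    m1, m2, mu = 50, 40, 7  # made-up counts for illustration
    print(mle_brute_force(m1, m2, mu), (m1 * m2) // mu)
```

Exact fractions are used so that the tie $L(N) = L(N - 1)$, which occurs when $m_1 m_2/\mu$ is an integer, is detected reliably.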
2.113*. Sampling inspection. Suppose we have a batch of $N$ products, which contains an (unknown) number $D$ of defective items. In order to estimate $D$ or a given function $\tau(D)$, we draw $n$ ($n < N$) items from the batch at random and without replacement. Each item is subjected to quality control. Let $X_i = 1$ if the $i$th tested item is defective, and $X_i = 0$ otherwise, $i = 1, \ldots, n$.

(1) Show that $d_n = X_1 + \ldots + X_n$ (the total number of defectives in the sample $X = (X_1, \ldots, X_n)$) is a complete sufficient statistic for $D$ and has a hypergeometric distribution $H(D, N, n)$. Having proved this, make sure that unbiased estimates exist only if $\tau(D)$ is a polynomial of a degree no greater than $n$. In this case if $\tau(D) = \sum_{j=0}^{n} a_j (D)_j$, $(D)_j = D(D - 1)\cdots(D - j + 1)$, $(D)_0 = 1$, then the statistic

$$\tau^* = \sum_{j=0}^{n} a_j \frac{(N)_j (d_n)_j}{(n)_j}$$

is an optimum unbiased estimator for $\tau(D)$.

(2) Find the explicit form of optimum estimators for the functions $\tau_1(D) = D$ and $\tau_2(D) = D(N - D)$, which are, up to the multipliers, the mean and variance, respectively, of the statistic $d_n$ (see Sec. 1.6).

(3) Make sure that the m.l.e. is $\hat{D}_n = [(N + 1)d_n/n]$.

Hint. Use the solution to Problem 2.33 and the formulas for the moments of $H(D, N, n)$.
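Two claims of Problem 2.113 lend themselves to an exact numerical check for a small batch. The sketch below is only an illustration: it assumes the hypergeometric probabilities $P(d_n = d) = \binom{D}{d}\binom{N - D}{n - d}/\binom{N}{n}$, checks that $(N)_j(d_n)_j/(n)_j$ has mean $(D)_j$, and searches the likelihood directly for the m.l.e. (taking the largest maximizer when there is a tie). The function names and the batch size used in the usage lines are made up.

```python
from fractions import Fraction
from math import comb


def falling(x, j):
    """Falling factorial (x)_j = x(x-1)...(x-j+1), with (x)_0 = 1."""
    out = 1
    for i in range(j):
        out *= x - i
    return out


def hyper_pmf(d, D, N, n):
    """P(d defectives in a sample of n from a batch of N with D defectives)."""
    return Fraction(comb(D, d) * comb(N - D, n - d), comb(N, n))


def mean_of_estimator(j, D, N, n):
    """Exact mean of (N)_j (d_n)_j / (n)_j under H(D, N, n); defined for j <= n."""
    return sum(
        Fraction(falling(N, j) * falling(d, j), falling(n, j)) * hyper_pmf(d, D, N, n)
        for d in range(n + 1)
    )


def mle_brute_force(d, N, n):
    """Largest D in 0..N that maximizes the likelihood of observing d defectives."""
    values = [hyper_pmf(d, D, N, n) for D in range(N + 1)]
    best = max(values)
    return max(D for D, v in enumerate(values) if v == best)


if __name__ == "__main__":
    N, n = 20, 6  # made-up batch and sample sizes
    print(all(mean_of_estimator(2, D, N, n) == falling(D, 2) for D in range(N + 1)))
    print([(d, mle_brute_force(d, N, n), (N + 1) * d // n) for d in range(n)])
```

The case $d_n = n$ is left out of the comparison on purpose: there the likelihood increases up to $D = N$, while the integer-part formula would give $N + 1$.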
2.114. Grouping of statistical data. Let $X_j = (X_{j1}, \ldots, X_{jn_j})$, $j = 1, \ldots, k$, be independent samples from the respective distributions $\mathcal{N}(\theta_{1j}, \theta_2^2)$, $j = 1, \ldots, k$, and let $\bar{X}_j$, $S_j^2 = S^2(X_j)$ be the respective sample means and variances. Prove that $\hat\theta = (\bar{X}_1, \ldots, \bar{X}_k, \hat\theta_2)$ is a m.l.e. for $\theta = (\theta_{11}, \ldots, \theta_{1k}, \theta_2)$, where

$$\hat\theta_2^2 = \frac{1}{n_1 + \ldots + n_k}\sum_{j=1}^{k} n_j S_j^2,$$

and that the statistic

$$\tilde\theta_2^2 = \frac{1}{n_1 + \ldots + n_k - k}\sum_{j=1}^{k} n_j S_j^2 = \frac{n_1 + \ldots + n_k}{n_1 + \ldots + n_k - k}\,\hat\theta_2^2$$

is an unbiased estimator for the common variance $\theta_2^2$.
Hint. Use the solution to Problem 2.86.

Confidence Estimation

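2.115. Show that given a sample $X = (X_1, \ldots, X_n)$, the $\gamma$-confidence interval for the parameter $\theta$ in the model $\mathcal{N}(\theta, \theta^2)$, $\theta > 0$, has the form $(\bar{X}/(1 + c_\gamma/\sqrt{n}),\ \bar{X}/(1 - c_\gamma/\sqrt{n}))$*. Find a similar solution for the model $\mathcal{N}(\theta, \theta^2)$, $\theta < 0$.

Hint. Use the fact that $\mathcal{L}((\bar{X} - \theta)\sqrt{n}/\theta) = \mathcal{N}(0, 1)$.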

2.116. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\mathcal{N}(\theta, \sigma^2)$.

(1) Show that any interval of the form $\Delta_\gamma(X) = \left(\bar{X} - \dfrac{\sigma}{\sqrt{n}}g_2,\ \bar{X} - \dfrac{\sigma}{\sqrt{n}}g_1\right)$, where $g_1 < g_2$ are any numbers satisfying the condition $\Phi(g_2) - \Phi(g_1) = \gamma$, is a $\gamma$-confidence interval for the parameter $\theta$. Prove that $\Delta_\gamma^*(X) = \left(\bar{X} \pm \dfrac{\sigma}{\sqrt{n}}c_\gamma\right)$ is the shortest interval among the $\gamma$-confidence intervals.

(2) How many observations should be made ($n = n(l, \gamma)$) for the precision of the localization of the parameter to be equal to $l$ for a given confidence level $\gamma$? Calculate $n(l, \gamma)$ for $\gamma = 0.99$, $l = 0.5$, and $l = 0.1$ ($\sigma = 1$). How does the confidence level $\gamma$ change depending on $l$ and $n$?

Hint. Use the central statistic $G(X; \theta) = \dfrac{\sqrt{n}}{\sigma}(\bar{X} - \theta)$.
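The calculation in part (2) of Problem 2.116 can be sketched as follows. The sketch rests on two assumptions that the problem statement does not settle: "precision $l$" is read as the half-width $\sigma c_\gamma/\sqrt{n}$ of the symmetric interval, and $c_\gamma$ is taken as the $(1 + \gamma)/2$ quantile of $\mathcal{N}(0, 1)$. If the precision is instead meant as the full length of the interval, $l$ should be replaced by $l/2$. The function name `required_sample_size` is introduced here.

```python
import math
from statistics import NormalDist


def required_sample_size(half_width, gamma, sigma=1.0):
    """Smallest n with sigma * c_gamma / sqrt(n) <= half_width (symmetric interval)."""
    c_gamma = NormalDist().inv_cdf((1 + gamma) / 2)  # (1 + gamma)/2 quantile of N(0, 1)
    return math.ceil((c_gamma * sigma / half_width) ** 2)


if __name__ == "__main__":
    for l in (0.5, 0.1):
        print(l, required_sample_size(l, gamma=0.99))
```

Under this reading the required sample size grows as $1/l^2$, so reducing the half-width fivefold multiplies the number of observations by about 25.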

2.117. Prove that the $\gamma$-confidence interval for the mean-square deviation $\theta$ in the model $\mathcal{N}(\mu, \theta^2)$ is any interval $\Delta_\gamma(X) = (T/a_2,\ T/a_1)$, where $T^2 = \sum_{i=1}^{n}(X_i - \mu)^2$, and the numbers $a_1 < a_2$ are chosen from the condition $\int_{a_1}^{a_2} x k_n(x^2)\,dx = \gamma/2$, where $k_n(t)$ is the density of the distribution $\chi^2(n)$. Define the shortest interval $\Delta_\gamma^*(X)$ in this class.

Hint. Use the fact that $\mathcal{L}(T^2/\theta^2) = \chi^2(n)$.

2.118. (Continued from Problem 2.117.) Show that the central $\gamma$-confidence interval for the variance $\theta^2$ has the form

$$\Delta_\gamma(X) = (T^2/g_2,\ T^2/g_1), \quad g_1 = \chi^2_{(1-\gamma)/2,\,n}, \quad g_2 = \chi^2_{(1+\gamma)/2,\,n},$$

while the interval $\Delta_\gamma^*(X) = (T^2/g_2,\ T^2/g_1)$ is the shortest among these intervals, where the numbers $g_1 < g_2$ satisfy the condition $\int_{g_1}^{g_2} k_n(t)\,dt = \gamma$ (see the solution to the previous problem).

* Recall that $c_\gamma = \Phi^{-1}\left(\dfrac{1 + \gamma}{2}\right)$.

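2.119. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution $\mathcal{N}(\theta_1, \theta_2^2)$, construct the one- and two-sided $\gamma$-confidence intervals for the mean $\theta_1$.

Hint. Use the assertion

$$\mathcal{L}\left(\sqrt{n - 1}\,\frac{\bar{X} - \theta_1}{S}\right) = S(n - 1).$$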

2.120. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution $\mathcal{N}(\theta_1, \theta_2^2)$, construct the one- and two-sided $\gamma$-confidence intervals for the variance $\tau = \theta_2^2$.

Hint. Use Fisher's theorem.

2.121. Given the realization (2.96, 3.07, 3.02, 2.98, 3.06) of a sample of size $n = 5$ from a normal distribution with unknown parameters, construct 95% confidence intervals for the mean and variance.
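A numerical route for Problem 2.121 might look as follows. This is a hedged sketch, not the book's worked answer: it uses the standard central two-sided intervals (Student's quantile for the mean, $\chi^2$ quantiles for the variance) and takes the quantiles for 4 degrees of freedom from common statistical tables, hard-coded here as constants because the Python standard library has no Student or $\chi^2$ quantile function. The function name `normal_intervals` is introduced for the example.

```python
import math

DATA = [2.96, 3.07, 3.02, 2.98, 3.06]

# Table values for 4 degrees of freedom (assumed from standard tables):
T_0975 = 2.776        # 0.975 quantile of Student's distribution
CHI2_0025 = 0.4844    # 0.025 quantile of chi-square
CHI2_0975 = 11.143    # 0.975 quantile of chi-square


def normal_intervals(xs, t_q, chi2_lo, chi2_hi):
    """Central two-sided intervals for the mean and the variance of a normal sample."""
    n = len(xs)
    mean = sum(xs) / n
    ss = sum((x - mean) ** 2 for x in xs)          # sum of squared deviations, equals n * S^2
    half = t_q * math.sqrt(ss / (n - 1)) / math.sqrt(n)
    mean_interval = (mean - half, mean + half)
    variance_interval = (ss / chi2_hi, ss / chi2_lo)
    return mean_interval, variance_interval


if __name__ == "__main__":
    mean_ci, var_ci = normal_intervals(DATA, T_0975, CHI2_0025, CHI2_0975)
    print("mean:", mean_ci)
    print("variance:", var_ci)
```

The same function can be reused with other table quantiles to obtain intervals at a different confidence level or for a different sample size.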

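2.122. Let $X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_m)$ be two independent samples, the first from the distribution $\mathcal{N}(\theta^{(1)}, \sigma_1^2)$, and the second from $\mathcal{N}(\theta^{(2)}, \sigma_2^2)$. Construct a $\gamma$-confidence interval for the difference $\tau = \theta^{(1)} - \theta^{(2)}$ of the means.

Hint. Show that $\mathcal{L}((\bar{X} - \bar{Y} - \tau)/\sigma) = \mathcal{N}(0, 1)$, $\sigma^2 = \dfrac{\sigma_1^2}{n} + \dfrac{\sigma_2^2}{m}$.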

2.123. (Continued from Problem 2.122.) In contrast to the previous case, all the observations have the same unknown variance $\theta_2^2$, i.e., $\mathcal{L}(X_i) = \mathcal{N}(\theta_1^{(1)}, \theta_2^2)$, $\mathcal{L}(Y_j) = \mathcal{N}(\theta_1^{(2)}, \theta_2^2)$. Estimate the difference $\tau = \theta_1^{(1)} - \theta_1^{(2)}$ of the means. Consider a more general situation when the variances are unknown but only differ by a known factor, i.e., $\mathcal{L}(X_i) = \mathcal{N}(\theta_1^{(1)}, c\theta_2^2)$, $\mathcal{L}(Y_j) = \mathcal{N}(\theta_1^{(2)}, \theta_2^2)$, where $c$ is known.

Hint. Show that the random variable

$$\sqrt{\frac{mn(m + n - 2)}{m + n}}\;\frac{\bar{X} - \bar{Y} - \tau}{\sqrt{nS^2(X) + mS^2(Y)}}$$

has Student's distribution $S(m + n - 2)$.
2.124. Two measurements at the same points of an angle gave (in degrees) 20.76 and 20.98. Six more such measurements of the same angle made by another device gave 21.64, 21.54, 22.32, 20.56, 21.43, 21.07. We assume that random measurement errors are normally distributed and that the first device is less accurate (the respective variance is four times that produced by the second device). Calculate a 95% confidence interval for the difference of the systematic errors due to the use of these devices.

Hint. Use the solution to Problem 2.123.
2.125. (Continued from Problem 2.123.) Suppose that the samples have different variances, i.e., $\mathcal{L}(X_i) = \mathcal{N}(\theta_1^{(1)}, \theta_2^{(1)2})$, $\mathcal{L}(Y_j) = \mathcal{N}(\theta_1^{(2)}, \theta_2^{(2)2})$. Construct a confidence interval for the ratio $\tau = \theta_2^{(1)2}/\theta_2^{(2)2}$.

Hint. Show that the central statistic here has the form

$$\frac{n(m - 1)}{m(n - 1)}\,\frac{S^2(X)}{S^2(Y)}\Big/\tau.$$

2.126. Two laboratories measured the concentration (in %) of sulphur in a standard sample of diesel fuel. Six independent measurements in the first laboratory gave 0.869, 0.874, 0.867, 0.875, 0.870, 0.869. Five similar measurements in the second laboratory gave 0.865, 0.870, 0.866, 0.871, 0.868. Under the assumption that the measurement errors are normally distributed, construct a 90% confidence interval for the ratio of the variances of the measurements in the two laboratories. If there are grounds to assume that the variances are the same, construct a similar interval for the difference of systematic errors in both laboratories.

2.127. Let $X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_m)$ be samples from the distributions $\Gamma(\theta_1, 1)$ and $\Gamma(\theta_2, 1)$, respectively. Construct a central $\gamma$-confidence interval for the ratio $\tau = \theta_2/\theta_1$.

Hint. Use Problem 1.51.

2.128. Make sure that $\left(X_{(1)} + \dfrac{1}{n}\ln(1 - \gamma),\ X_{(1)}\right)$, where $X_{(1)} = \min_i X_i$, is a $\gamma$-confidence interval for the parameter $\theta$ of an exponential distribution with the density $f(x; \theta) = e^{-(x - \theta)}$, $x \ge \theta$. What is the central $\gamma$-confidence interval?

Hint. Find the distribution of the statistic $X_{(1)}$ and take into account that $\{X_{(1)} \ge \theta\}$ is a certain event.

2.129. Given a sample of size $n$, make sure that $(X_{(n)},\ X_{(n)}/\sqrt[n]{1 - \gamma})$ is a $\gamma$-confidence interval for the parameter $\theta$ in the model $R(0, \theta)$ (as in Problem 2.128).

Hint. Show that $\mathcal{L}((X_{(n)}/\theta)^n) = R(0, 1)$ (see Problem 1.35).




2.130. Consider the model $W(0, \lambda, \theta)$ (see Problem 2.76). Show that the interval $(2T/\chi^2_{(1+\gamma)/2,\,2n},\ 2T/\chi^2_{(1-\gamma)/2,\,2n})$ is a central $\gamma$-confidence interval for the function $\tau(\theta) = \theta^\lambda$. Specifically, for $\lambda = 1$ we have the solution for an exponential model $\Gamma(\theta, 1)$.
Hint. Use the solution to Problem 2.76.
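2.131*. Show that the $\gamma$-confidence region for the parameters $(\theta_1, \tau = \theta_2^2)$ in the general normal model $\mathcal{N}(\theta_1, \theta_2^2)$ found from the sample $X = (X_1, \ldots, X_n)$ has the form

$$\mathcal{G}_\gamma(X) = \left\{(\theta_1, \tau):\ \tau > \frac{n(\bar{X} - \theta_1)^2}{c_{\gamma_1}^2},\ \ \frac{nS^2}{\chi^2_{(1+\gamma_2)/2,\,n-1}} < \tau < \frac{nS^2}{\chi^2_{(1-\gamma_2)/2,\,n-1}}\right\},$$

where $\gamma_1\gamma_2 = \gamma$.

Hint. Use Fisher's theorem.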
2.132*. Let $\{X_i = (X_{i1}, X_{i2}),\ i = 1, \ldots, n\}$ be a sample from the bivariate normal distribution

-"(>•«■ H--.w"D- ~' <e< '- 

with a known matrix $\Sigma$. Using Problem 1.59, construct the $\gamma$-confidence region for $\theta = (\theta_1, \theta_2)$.

2.133. Let $X = (X_1, \ldots, X_n)$ be a sample from $Bi(1, \theta)$. Using a point estimator $T = \bar{X}$ for the parameter $\theta$, show that the central $\gamma$-confidence interval $(T_1, T_2)$ for it is defined by the conditions

$$\sum_{k=nT}^{n} C_n^k T_1^k(1 - T_1)^{n-k} = \sum_{k=0}^{nT} C_n^k T_2^k(1 - T_2)^{n-k} = \frac{1 - \gamma}{2};$$

here $T_1 = Z\left(\dfrac{1 - \gamma}{2};\ nT,\ n - nT + 1\right)$, $T_2 = Z\left(\dfrac{1 + \gamma}{2};\ nT + 1,\ n - nT\right)$, where $Z(p; a, b)$ is a $p$-quantile of the beta distribution $\beta(a, b)$. Construct an approximate $\gamma$-confidence interval for $\theta$ for large $n$.

Hint. Use Problems 1.39 (3), 2.43, and 2.84.

2.134. Given a sample $X = (X_1, \ldots, X_n)$ from the Bernoulli model $Bi(1, \theta)$, construct an asymptotic (for $n \to \infty$) $\gamma$-confidence interval for $\theta$ based on the normal approximation $\mathcal{L}(\sqrt{n}(\bar{X} - \theta)/\sqrt{\theta(1 - \theta)}) \approx \mathcal{N}(0, 1)$ (the De Moivre-Laplace theorem). Compare the resultant solution with that based on the asymptotic properties of maximum likelihood estimates (see Problem 2.133).

2.135. (Continued from Problem 2.134.) Show that $\left(\arcsin\sqrt{\bar{X}} \pm \dfrac{c_\gamma}{2\sqrt{n}}\right)$ is an asymptotic $\gamma$-confidence interval for the function $\tau(\theta) = \arcsin\sqrt{\theta}$ and then find an approximate confidence interval for $\theta$.

Hint. Use Problem 2.109.

2.136. In 540 Bernoulli trials a positive result was observed 216 times. Calculate a 95% confidence interval for the variance of the number of positive outcomes.
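One possible numerical route for Problem 2.136 is sketched below; it is an assumption-laden illustration, not the book's answer. The sketch takes the simplest asymptotic interval $\bar{X} \pm c_\gamma\sqrt{\bar{X}(1 - \bar{X})/n}$ for $\theta$ and maps it through $n\theta(1 - \theta)$, the variance of the number of positive outcomes; this mapping is monotone only when the whole interval for $\theta$ lies on one side of $1/2$, which the code checks. The function name `variance_interval` is introduced here.

```python
import math
from statistics import NormalDist


def variance_interval(n, successes, gamma):
    """Asymptotic interval for n*theta*(1-theta) via a normal-approximation interval for theta."""
    theta_hat = successes / n
    c_gamma = NormalDist().inv_cdf((1 + gamma) / 2)
    half = c_gamma * math.sqrt(theta_hat * (1 - theta_hat) / n)
    lo, hi = theta_hat - half, theta_hat + half
    if lo < 0.5 < hi:
        # theta*(1-theta) is not monotone across 1/2; this simple mapping would be wrong.
        raise ValueError("interval for theta straddles 1/2")
    ends = sorted((n * lo * (1 - lo), n * hi * (1 - hi)))
    return (lo, hi), (ends[0], ends[1])


if __name__ == "__main__":
    theta_ci, var_ci = variance_interval(540, 216, 0.95)
    print("theta:", theta_ci)
    print("variance of the count:", var_ci)
```

The arcsine interval of Problem 2.135 could be substituted for the first step; the mapping to the variance stays the same.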

2.137. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\Pi(\theta)$. Using the point estimator $T = \bar{X}$, show that the central $\gamma$-confidence interval $(T_1, T_2)$ for $\theta$ is defined by the conditions

$$\sum_{k=nT}^{\infty} e^{-nT_1}\frac{(nT_1)^k}{k!} = \sum_{k=0}^{nT} e^{-nT_2}\frac{(nT_2)^k}{k!} = \frac{1 - \gamma}{2}.$$

An approximate $\gamma$-confidence interval for large $n$ is $(\bar{X} \pm c_\gamma\sqrt{\bar{X}/n})$.

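2.138. Construct an asymptotic $\gamma$-confidence interval for the parameter $\theta$ of Poisson's model $\Pi(\theta)$. Use the normal approximation $\mathcal{L}(2\sqrt{n}(\sqrt{\bar{X}} - \sqrt{\theta})) \approx \mathcal{N}(0, 1)$ (see Problem 2.109) or the approximation $\mathcal{L}\left(\dfrac{\sqrt{n}(\bar{X} - \theta)}{\sqrt{\theta}}\right) \approx \mathcal{N}(0, 1)$ (the Central Limit Theorem). Compare the result with that of the previous problem.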

2.139. Independent random variables $X_1$ and $X_2$ have Poisson's distribution with the parameters $\lambda_1$ and $\lambda_2$, respectively. Suppose that we know their sum $X_1 + X_2 = n$. Construct a confidence interval for $\theta = \lambda_1/(\lambda_1 + \lambda_2)$ given an observation on $X_1$.

Hint. Find the conditional distribution $\mathcal{L}(X_1 \mid X_1 + X_2 = n)$ (see Problem 1.54) and use the solution to Problem 2.133.

2.140. Construct an asymptotic $\gamma$-confidence interval for the parameter $\theta$ of a power series distribution (see Problem 2.60). Use the result to estimate the parameter $\theta$ in the model $\overline{Bi}(r, \theta)$.

Hint. Use Problem 2.96 and its solution.

2.141. Construct an asymptotic $\gamma$-confidence interval for the
parameter $\theta$ in the model $\Gamma(\theta, \lambda)$.

Hint. Use Problems 2.43 and 2.48, and the approximation in
Problem 2.109.

2.142. Construct an asymptotic $\gamma$-confidence interval for the
parameter $\theta$ in the model $\mathcal{N}(\mu, \theta^2)$.

Hint. Use the approximation $\mathcal{L}\bigl(\sqrt{2n}(\ln\hat\theta_n - \ln\theta)\bigr) \sim \mathcal{N}(0, 1)$ in
Problem 2.109.

2.143. Construct an asymptotic $\gamma$-confidence interval for the function
$\tau(\theta) = \Phi\left(\dfrac{x - \theta_1}{\theta_2}\right)$ in the model $\mathcal{N}(\theta_1, \theta_2^2)$ (see Problem 2.72).



Hint. Use Problem 2.87.

2.144*. Given a polynomial model $M(n; p_1, \ldots, p_N)$ with the
unknown parameters $p_1, \ldots, p_N$ (see Problem 2.29), construct an
asymptotic (for $n \to \infty$) $\gamma$-confidence region for $p_1, \ldots, p_N$ based on
the respective maximum likelihood estimates.

Hint. Use Problems 2.63, 2.45, and the asymptotic version of
Problem 1.40, i.e., if $\mathcal{L}(Y_n) \sim \mathcal{N}(\mu_n, \Sigma_n)$ as $n \to \infty$ and $|\Sigma_n| \ne 0$,
then $\mathcal{L}\bigl((Y_n - \mu_n)' \Sigma_n^{-1} (Y_n - \mu_n)\bigr) \to \chi^2(m)$, where $m$ is the
dimensionality of the vector $Y_n$.

2.145*. Let $n$, $\bar X$, and $S^2$ be the sample size, mean, and variance from
the distribution $\mathcal{N}(\theta_1, \theta_2^2)$. Show that the result of the next, $(n + 1)$th,
trial is in the interval $\left(\bar X \pm t_{(1+\gamma)/2,\, n-1}\, S\sqrt{(n + 1)/(n - 1)}\right)$ with
probability $\gamma$.

Hint. Use Fisher's theorem.

2.146. (Continued from Problem 2.145.) Five independent measurements
of a body gave (in grams) 4.12, 3.92, 4.55, 4.04, 4.35. Assume
that the measurement errors are $\mathcal{N}(0, \theta^2)$ normally distributed random
variables and construct a 95% confidence interval for the result of
the next (sixth) measurement.

2.147. Given $\mathcal{L}(\xi) = \chi^2(n)$, where $n$ is the unknown number of
degrees of freedom, construct an approximate 90% confidence interval
for $n$ which corresponds to the realization $\xi = 157.4$.

Hint. Use the normal approximation for the $\chi^2$-distribution
(Problem 1.45).

2.148. Given the samples from Problem 2.110, construct
$\gamma$-confidence intervals for the respective parameters.

Hint. Use Problems 2.119, 2.120, 2.133, and 2.129, respectively.

2.149. Let $X_1, \ldots, X_n$ be independent observations on a random
variable $\xi$ with $E\xi^{2k} < \infty$. Prove that an asymptotic $\gamma$-confidence
interval for the moment $\alpha_k = E\xi^k$ has the form
$\left(A_{nk} \pm c_\gamma\sqrt{(A_{n,2k} - A_{nk}^2)/n}\right)$.

Hint. Using Problem 1.28, show that

$$\mathcal{L}\left(\sqrt{n}(A_{nk} - \alpha_k)/\sqrt{A_{n,2k} - A_{nk}^2}\right) \to \mathcal{N}(0, 1)$$

as $n \to \infty$.

2.150. Let $\rho_n$ be a sample correlation coefficient constructed from
$n$ observations on a two-dimensional random variable $\xi = (\xi_1, \xi_2)$ with
unknown parameters and $Z_n = \frac{1}{2}\ln\bigl((1 + \rho_n)/(1 - \rho_n)\bigr)$. Using the
normal approximation $\mathcal{L}(Z_n) \sim \mathcal{N}\left(\zeta, \frac{1}{n - 3}\right)$, where
$\zeta = \frac{1}{2}\ln\frac{1 + \rho}{1 - \rho}$, $\rho = \operatorname{corr}(\xi_1, \xi_2)$, construct an asymptotic $\gamma$-confidence
interval for $\rho$.

Hint. Use the fact that the function $t(x) = \ln\frac{1 + x}{1 - x}$,
$x \in (-1, 1)$, is monotone.



CHAPTER 3 



Tests of Statistical Hypotheses 



3.1. Any assumption on the form or properties of the random variables
observed in an experiment is called a statistical hypothesis (or
simply hypothesis). If a hypothesis $H_0$ (called the null hypothesis) is
formulated for the process under investigation, then we test it by
constructing a rule (algorithm) which allows us to accept or reject $H_0$ on
the basis of observations (statistical data). Such rules are called
goodness of fit tests (or simply tests) for the hypothesis $H_0$. If $H_0$
corresponds to a unique distribution of observations, then it is called
a simple hypothesis, otherwise it is a composite hypothesis.

Let the outcome of an experiment be defined by a random variable
$X = (X_1, \ldots, X_n)$, and let $H_0$ be a hypothesis about its distribution.
We assume that the statistic $T = T(X)$ describes the deviation of the
empirical data from the respective (under the hypothesis $H_0$) hypothetical
values and its distribution is known (exactly or approximately)
when $H_0$ is true. Then for any sufficiently small number $\alpha > 0$ we
can define a subset $\mathcal{T}_{1\alpha}$ of the set $\mathcal{T} = \{t : t = T(x),\ x \in \mathcal{X}\}$ which satisfies
(exactly or approximately) the condition

$$P(T \in \mathcal{T}_{1\alpha} \mid H_0) \le \alpha. \qquad (3.1)$$

Any subset $\mathcal{T}_{1\alpha}$ brings about the following goodness of fit test for
the hypothesis $H_0$: if $t = T(x)$ is the observed value of the statistic
$T(X)$, then the hypothesis $H_0$ is rejected for $t \in \mathcal{T}_{1\alpha}$, otherwise we assume
that the data are consistent with $H_0$, or, if $t \notin \mathcal{T}_{1\alpha}$, then the
hypothesis $H_0$ is accepted (note that $t \notin \mathcal{T}_{1\alpha}$ does not prove that $H_0$
is true). If $H_0$ is true, we may, according to our rule, reject it (i.e.,
make a wrong decision) with a probability smaller than or equal
to $\alpha$. The number $\alpha$ is called the significance level of a test, and the
set $\mathcal{T}_{1\alpha}$ is called the critical set (region) for the hypothesis $H_0$. The
statistic $T$ is called a test statistic, and the test itself is called the
$\mathcal{T}_{1\alpha}$-test.

Thus, in this technique a test is defined by the critical region $\mathcal{T}_{1\alpha}$
in the range of the statistic $T$ for a chosen significance level $\alpha$. Different
tests (generated by different statistics $T$) can be compared using
the notions of an alternative distribution (alternative hypothesis) and
the power of a test.

Any admissible distribution $F_X = F$ of a sample $X$ which differs
from the hypothetical (under the hypothesis $H_0$) distribution is called
an alternative distribution, or alternative. The set of all the alternatives
is called an alternative hypothesis and is denoted $H_1$. The power
function of the $\mathcal{T}_{1\alpha}$-test is a functional

$$W(F) = W(\mathcal{T}_{1\alpha}; F) = P(T \in \mathcal{T}_{1\alpha} \mid F) \qquad (3.2)$$

on the set of all the admissible distributions $\{F\}$. Thus, $W(F)$ is the
probability that the values of the test statistic are in the critical region
when $F$ is the true distribution of the observations. If $F \in H_1$, the value
of $W(F)$ is called the test power under the alternative $F$. This value
characterizes the probability of making a correct decision (rejecting
$H_0$) when $H_0$ is not true. A test whose power under the alternatives
is greater is chosen as the best one compared to other tests with the
significance level $\alpha$.

Unbiasedness is a desirable property for the $\mathcal{T}_{1\alpha}$-test. This means
that the condition

$$W(\mathcal{T}_{1\alpha}; F) \le \alpha \quad \forall F \in H_0 \qquad (3.3)$$

must be met in addition to the condition

$$W(\mathcal{T}_{1\alpha}; F) \ge \alpha \quad \forall F \in H_1 \qquad (3.4)$$

(i.e., the probability of getting into the critical region must be greater
under the alternative than under the null hypothesis).

The power function cannot always be found (for this we must know
the distribution of the test statistic under all the alternatives), but it
is frequently possible to investigate its asymptotic behaviour when the
sample size $n$ tends to infinity (to show that the power function depends
on the size of a sample, we write $W_n(F)$). When studying the
asymptotic properties of tests, we first check whether they are consistent.
Consistency implies that

$$\lim_{n \to \infty} W_n(F) = 1 \quad \forall F \in H_1. \qquad (3.5)$$

Thus, when the number of observations is large, a consistent test shows
any deviations from the null hypothesis with a probability close to
unity, i.e., if any fixed alternative is true and $n$ is large, then we get
into the critical region with a probability close to unity, and hence
reject the null hypothesis which is false (i.e., we take a correct decision).
Other properties of a consistent test can be investigated when studying
the asymptotic behaviour of the power $W_n(F_n)$ under "close" alternatives
$F_n$, i.e., when the sequence $\{F_n\}$ of the alternatives gets closer
(in some sense) to the null hypothesis $H_0$ as $n \to \infty$. The "threshold"
case of finding a sequence $\{F_n\}$ for which

$$\lim_{n \to \infty} W_n(F_n) = \gamma, \quad \alpha < \gamma < 1, \qquad (3.6)$$

is the most interesting. Here we have to calculate $\gamma$.
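
The consistency property (3.5) can be observed numerically on a simple model. The sketch below is an illustration under stated assumptions rather than part of the text: it takes the null hypothesis that a coin is symmetric, uses the number of heads as the test statistic with a symmetric two-sided critical region, and evaluates the exact power at the fixed alternative $p = 0.6$ for growing $n$. All function names are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def critical_region(n, alpha):
    """Largest region {k: |k - n/2| >= c} whose probability under p = 1/2 is <= alpha."""
    for c in range(n // 2 + 2):
        region = [k for k in range(n + 1) if abs(k - n / 2) >= c]
        if sum(binom_pmf(k, n, 0.5) for k in region) <= alpha:
            return region
    return []


def power(n, p, alpha=0.05):
    """W_n(p): probability of falling into the critical region when p is true."""
    return sum(binom_pmf(k, n, p) for k in critical_region(n, alpha))


for n in (20, 80, 320):
    print(n, round(power(n, 0.5), 4), round(power(n, 0.6), 4))
```

The first printed column of probabilities is the actual size of the test (never above 0.05); the second grows towards unity with $n$, as (3.5) requires for a consistent test.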

3.2. Kolmogorov's test and the $\chi^2$-test are frequently used to verify
the hypothesis $H_0$: $F_\xi(x) = F(x)$.

Kolmogorov's test is applied when $F(x)$ is a continuous function.
The maximum deviation $D_n = D_n(X) = \sup_{-\infty < x < \infty} |F_n(x) - F(x)|$ of
the empirical distribution function $F_n(x)$ from the hypothetical function
$F(x)$ is the test statistic. For fixed $x$ the value of $F_n(x)$ is an optimum
estimator for $F(x)$ and, as $n$ grows, we have $F_n(x) \to F(x)$. This
means that when the hypothesis $H_0$ is true, $D_n$ does not essentially
deviate from zero at least for large $n$. The limiting Kolmogorov's distribution
$K(t) = \sum_{j=-\infty}^{\infty} (-1)^j \exp\{-2j^2t^2\}$, for which published tables
are available, gives a good approximation of the exact distribution
$P(\sqrt{n}D_n \le t)$ for $n > 20$.

The critical region of the test is defined by the inequality $\sqrt{n}D_n \ge t_\alpha$,
where $K(t_\alpha) = 1 - \alpha$. For example, $t_{0.1} = 1.23$, $t_{0.05} = 1.36$,
$t_{0.01} = 1.63$.
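
A minimal sketch of Kolmogorov's test is given below. It is only an illustration: the function names and the sample are hypothetical, and the infinite series for $K(t)$ is truncated, which is harmless for the values of $t$ of practical interest.

```python
import math


def kolmogorov_statistic(sample, cdf):
    """D_n = sup |F_n(x) - F(x)| for a continuous hypothetical cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # F_n jumps from (i - 1)/n to i/n at x, so both sides are checked.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def kolmogorov_cdf(t, terms=100):
    """Truncated series K(t) = sum over j of (-1)^j exp(-2 j^2 t^2)."""
    if t <= 0:
        return 0.0
    return sum((-1) ** j * math.exp(-2 * j * j * t * t)
               for j in range(-terms, terms + 1))


# Hypothetical sample tested against the uniform law on (0, 1).
sample = [0.1, 0.4, 0.7]
d_n = kolmogorov_statistic(sample, lambda x: x)
reject = math.sqrt(len(sample)) * d_n >= 1.36   # boundary for alpha = 0.05
print(d_n, reject, round(kolmogorov_cdf(1.36), 4))
```

The sample here is far too small for the limiting distribution to be a good approximation; it only keeps the arithmetic easy to follow.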

The original statistical data are often grouped preliminarily. Let
$X = (X_1, \ldots, X_n)$ be repeated independent observations on a random
variable $\xi$ with the set of possible values $A$. We consider a partition
$A = A_1 \cup \ldots \cup A_N$, $A_i \cap A_j = \emptyset$, $i \ne j$, and suppose that $\nu_j$ is the
number of the units of the sample $X$ in the subset $A_j$, and
$p_j = p_j(F) = \int_{A_j} dF(x)$ is the probability that $\xi$ is in $A_j$ for the given
distribution $F$ of $\xi$, $j = 1, \ldots, N$ ($\nu_1 + \ldots + \nu_N = n$,
$p_1 + \ldots + p_N = 1$). Then the frequency vector $\nu = (\nu_1, \ldots, \nu_N)$ has a polynomial
distribution $M(n; p = (p_1, \ldots, p_N))$ and every hypothesis on
the distribution $\mathcal{L}(\xi)$ is transformed into a respective hypothesis about
the vector $p$ from the distribution $M(n; p)$. Thus, the given method
implies a transition from the original observations $X = (X_1, \ldots, X_n)$
to the frequencies $\nu = (\nu_1, \ldots, \nu_N)$ with which the sample units get
in the respective subsets $A_1, \ldots, A_N$. This representation of statistical
data is called the method of grouped observations, and the subsets
$A_1, \ldots, A_N$ are called the grouping intervals. The relative frequency
$\nu_j/n$ of getting into the interval $A_j$ is a consistent estimate for the probability
$p_j$, and we may choose various functions of the differences
$\nu_j/n - p_j^0$, $j = 1, \ldots, N$, as a measure of discrepancy between the
empirical data and the hypothetical values $p^0$. The measure

$$X_n^2 = \sum_{j=1}^{N} \frac{(\nu_j - np_j^0)^2}{np_j^0}$$

suggested by K. Pearson is frequently used. If $H_0$ is a simple hypothesis
which uniquely describes the probabilities $p^0 = (p_1^0, \ldots, p_N^0)$, then
for $0 < p_j^0 < 1$, $j = 1, \ldots, N$, and $n \to \infty$ the respective goodness of
fit test (called the $\chi^2$-test) is asymptotically defined by the critical
region $\{X_n^2 \ge \chi^2_{1-\alpha,\,N-1}\}$, where $\chi^2_{p,f}$ is a $p$-quantile of the distribution
$\chi^2(f)$. Other applications of this method can be found in [7, Chap. 3].
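
The computation behind the $\chi^2$-test for a simple hypothesis can be sketched as follows. The frequencies and probabilities are hypothetical, the function name is an assumption, and the quantile is the tabulated 0.95-quantile of $\chi^2(2)$.

```python
def pearson_statistic(counts, probs):
    """Pearson's measure: sum of (nu_j - n p_j)^2 / (n p_j) over the grouping intervals."""
    n = sum(counts)
    return sum((v - n * p) ** 2 / (n * p) for v, p in zip(counts, probs))


# Hypothetical grouped data: N = 3 intervals, n = 100 observations.
counts = [18, 55, 27]
probs = [0.25, 0.5, 0.25]          # hypothetical values p^0 under H0

x2 = pearson_statistic(counts, probs)
CHI2_095_2DF = 5.991               # tabulated 0.95-quantile of chi^2 with N - 1 = 2 d.f.
reject = x2 >= CHI2_095_2DF
print(round(x2, 2), reject)
```

Here the statistic is about 2.62, below the boundary 5.991, so these hypothetical data are consistent with $H_0$ at the significance level 0.05.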

3.3. Verifying the homogeneity of statistical data is an important
problem. Let $X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_m)$ be independent
samples which describe the same process or phenomenon but
are generally obtained under different conditions. We have to find out
whether they are samples from the same distribution or whether the
distribution law has changed from sample to sample, i.e., we have to
verify the homogeneity hypothesis $H_0$ that $F_1(x) \equiv F_2(x)$, where $F_1(x)$
and $F_2(x)$ are the distribution functions of $X$ and $Y$, respectively. Smirnov's
homogeneity test for continuous distributions is frequently applied
in this case. The test is based on the statistic
$D_{nm} = D_{nm}(X, Y) = \sup_x |F_{1n}(x) - F_{2m}(x)|$, where $F_{1n}(x)$ and $F_{2m}(x)$ are the empirical
distribution functions constructed from the samples $X$ and $Y$, respectively.
When the hypothesis $H_0$ is true, the functions $F_{1n}(x)$ and $F_{2m}(x)$
get closer as the sizes $n$ and $m$ of the samples grow, and therefore
the statistic $D_{nm}$ practically does not deviate from zero. The exact
distribution $P\left(\sqrt{\frac{nm}{n + m}}\,D_{nm} \le t\right)$ is approximated by the limiting
Kolmogorov's distribution $K(t)$. The critical region of the test is found
from the inequality $\sqrt{\frac{nm}{n + m}}\,D_{nm} \ge t_\alpha$, where $K(t_\alpha) = 1 - \alpha$.
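
Smirnov's statistic is easy to compute because the supremum is attained at one of the pooled sample points. The following sketch assumes continuous data without ties between the samples; the function names and the two samples are hypothetical.

```python
from bisect import bisect_right
from math import sqrt


def smirnov_statistic(x, y):
    """D_nm = sup |F_1n(t) - F_2m(t)|, evaluated at the pooled sample points."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    d = 0.0
    for t in xs + ys:
        f1 = bisect_right(xs, t) / n   # empirical d.f. of the first sample at t
        f2 = bisect_right(ys, t) / m   # empirical d.f. of the second sample at t
        d = max(d, abs(f1 - f2))
    return d


def smirnov_rejects(x, y, t_alpha=1.36):
    """Critical region sqrt(nm/(n+m)) * D_nm >= t_alpha (1.36 corresponds to alpha = 0.05)."""
    n, m = len(x), len(y)
    return sqrt(n * m / (n + m)) * smirnov_statistic(x, y) >= t_alpha


# Hypothetical samples, for illustration only.
print(smirnov_statistic([1, 3, 5], [2, 4, 6]))
print(smirnov_statistic([1, 2, 3], [4, 5, 6]))
```

For completely separated samples the statistic equals 1, its largest possible value; for interleaved samples it stays small.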



The homogeneity $\chi^2$-test is frequently used to verify the homogeneity
of discrete data or the data which can be made discrete by the
method of grouped observations. This method is also used to compare
any number of samples.
Let us carry out $k$ series of independent observations of sizes
$n_1, \ldots, n_k$ and observe in each series a variable feature assuming one
of the $s$ possible values (outcomes). Let $\nu_{ij}$ be the number of the realizations
of the $i$th outcome in the $j$th series $\left(\sum_{i=1}^{s} \nu_{ij} = n_j,\ j = 1, \ldots, k\right)$.
We will test the hypothesis $H_0$ that all the observations
were carried out on the same random variable. The quantity

$$X_n^2 = n\left(\sum_{i=1}^{s}\sum_{j=1}^{k} \frac{\nu_{ij}^2}{\nu_{i\cdot}\,n_j} - 1\right),$$

where $\nu_{i\cdot} = \sum_{j=1}^{k} \nu_{ij}$, $i = 1, \ldots, s$, $n = n_1 + \ldots + n_k$, is in our case
the statistic of the $\chi^2$-test.

The critical region is defined as $\{X_n^2 > \chi^2_{1-\alpha,\,(s-1)(k-1)}\}$, where the
test boundary is found in the tables for the quantiles of the
$\chi^2$-distribution. If $n$ is sufficiently large, the probability that the true
hypothesis will be rejected is approximately $\alpha$.
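
The homogeneity statistic can be checked against the more familiar "observed minus expected" form of Pearson's measure. The sketch below does this for a hypothetical table with $s = 3$ outcomes and $k = 2$ series; the function names are assumptions.

```python
def homogeneity_statistic(table):
    """n * (sum of nu_ij^2 / (nu_i. * n_j) - 1); table[i][j] = count of outcome i in series j."""
    s, k = len(table), len(table[0])
    row = [sum(table[i]) for i in range(s)]                       # nu_i.
    col = [sum(table[i][j] for i in range(s)) for j in range(k)]  # n_j
    n = sum(row)
    total = sum(table[i][j] ** 2 / (row[i] * col[j])
                for i in range(s) for j in range(k))
    return n * (total - 1)


def observed_expected_form(table):
    """Equivalent form: sum of (nu_ij - e_ij)^2 / e_ij with e_ij = nu_i. * n_j / n."""
    s, k = len(table), len(table[0])
    row = [sum(table[i]) for i in range(s)]
    col = [sum(table[i][j] for i in range(s)) for j in range(k)]
    n = sum(row)
    return sum((table[i][j] - row[i] * col[j] / n) ** 2 / (row[i] * col[j] / n)
               for i in range(s) for j in range(k))


# Hypothetical data: 3 outcomes observed in 2 series of 100 trials each.
table = [[20, 30], [30, 30], [50, 40]]
x2 = homogeneity_statistic(table)
CHI2_095_2DF = 5.991   # tabulated 0.95-quantile, (s - 1)(k - 1) = 2 degrees of freedom
print(round(x2, 4), x2 > CHI2_095_2DF)
```

Both forms give the same value (about 3.11 here), so the hypothetical series are not found to differ at the level 0.05.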

3.4. If $X = (X_1, \ldots, X_n)$ is a sample from the distribution $\mathcal{L}(\xi)$
and the set $\mathcal{F}$ of all the admissible distributions of the observable
random variable $\xi$ is given in a parametric form as $\mathcal{F} = \{F(x; \theta),\
\theta = (\theta_1, \ldots, \theta_r) \in \Theta\}$, then the hypotheses about the distribution $\mathcal{L}(\xi)$
are formulated in terms of the unknown parameter and are called
parametric. In the general case the parametric (null) hypothesis is
given in the form $H_0$: $\theta \in \Theta_0$ for a subset $\Theta_0 \subset \Theta$. Then the alternative
hypothesis is of the form $H_1$: $\theta \in \Theta_1 = \Theta \setminus \Theta_0$. Thus, in a parametric
model the alternative hypothesis has a form similar to that of the null
hypothesis, and a deviation from the null hypothesis is equivalent to
accepting a concrete alternative.

In the general theory of testing parametric hypotheses the tests are
directly specified by the respective critical regions in the sample space
$\mathcal{X} = \{x = (x_1, \ldots, x_n)\}$. Thus, the test for verifying $H_0$ at a significance
level $\alpha$ is given by a subset $\mathcal{X}_{1\alpha} \subset \mathcal{X}$ for which the condition

$$P_\theta(X \in \mathcal{X}_{1\alpha}) \le \alpha, \quad \forall \theta \in \Theta_0 \qquad (3.7)$$

(an analogue of (3.1)) is met. Then the test (called the $\mathcal{X}_{1\alpha}$-test) is
constructed as follows. If $x$ is the observed realization of the sample $X$,
then the hypothesis $H_0$ is rejected for $x \in \mathcal{X}_{1\alpha}$ (the alternative hypothesis
$H_1$ is accepted), and if $x \in \mathcal{X}_{0\alpha} = \overline{\mathcal{X}}_{1\alpha}$, then the hypothesis $H_0$ is
accepted. The power function is in this case written as

$$W(\theta) = W(\mathcal{X}_{1\alpha}; \theta) = P_\theta(X \in \mathcal{X}_{1\alpha}), \quad \theta \in \Theta$$

(compare it with (3.2)).

The probabilities of erroneous decisions for the $\mathcal{X}_{1\alpha}$-test are expressed
through its power function as follows. The probability of
Type I error (rejecting $H_0$ when it is true) is equal to $W(\theta)$, $\theta \in \Theta_0$
(in symbols we write $P(H_1 \mid H_0)$), and the probability of Type II error
(accepting $H_0$ when it is false) is equal to $1 - W(\theta)$, $\theta \in \Theta_1$ (in symbols
we write $P(H_0 \mid H_1)$).
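
Both error probabilities follow directly from the power function. As a hedged illustration, the sketch below takes a hypothetical setting that is not from the text: $n = 10$ Bernoulli trials, $H_0$: $\theta = 0.5$ against $H_1$: $\theta = 0.8$, with the critical region "at least 8 successes".

```python
from math import comb


def binom_pmf(k, n, theta):
    """Probability of k successes in n Bernoulli trials with success probability theta."""
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def power(theta, n, critical_region):
    """W(theta) = P_theta(number of successes falls into the critical region)."""
    return sum(binom_pmf(k, n, theta) for k in critical_region)


n = 10
critical_region = range(8, n + 1)            # hypothetical choice: reject H0 for 8, 9, 10

type_1 = power(0.5, n, critical_region)      # P(H1 | H0) = W(theta0)
type_2 = 1 - power(0.8, n, critical_region)  # P(H0 | H1) = 1 - W(theta1)
print(round(type_1, 4), round(type_2, 4))
```

Enlarging the critical region lowers the Type II error probability at the cost of a larger Type I error probability, which is why the significance level is fixed first.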

We now formulate a rational principle for choosing the critical
region in terms of the probabilities of errors, i.e., for a given number
of trials we fix the boundary for the probability of Type I error,
choosing the critical region for which the probability of Type II error
is minimal.

Let $\mathcal{X}^*_{1\alpha}$ and $\mathcal{X}_{1\alpha}$ be two tests of the same significance level $\alpha$ for the
hypothesis $H_0$. If $W(\mathcal{X}^*_{1\alpha}; \theta) \le W(\mathcal{X}_{1\alpha}; \theta)$ for $\theta \in \Theta_0$ and
$W(\mathcal{X}^*_{1\alpha}; \theta) \ge W(\mathcal{X}_{1\alpha}; \theta)$ for $\theta \in \Theta_1$ (the strict inequality holding for at least one
$\theta \in \Theta_1$), then we say that the $\mathcal{X}^*_{1\alpha}$-test is uniformly more powerful compared
to the $\mathcal{X}_{1\alpha}$-test, the first test being preferable because it leads
to smaller errors. If the indicated inequalities hold for any $\mathcal{X}_{1\alpha}$, then
$\mathcal{X}^*_{1\alpha}$ is called a uniformly most powerful (u.m.p.) test. If $H_1$ is a simple
alternative (the set $\Theta_1$ consists of one point), then we speak of a most
powerful test instead of a uniformly most powerful test. In some cases
this method of comparing tests allows us to find an optimal (best)
test for a given problem. It is sometimes possible to construct optimal
tests in the class of unbiased tests, i.e., when the condition $W(\theta) \ge \alpha$
$\forall \theta \in \Theta_1$ is met in addition to (3.7).

Theoretically, it is sometimes convenient to deal with
randomized tests, when for a given observation $x$ the hypothesis $H_0$ is rejected
with the probability $\varphi(x)$ and accepted with the complementary
probability $1 - \varphi(x)$. The function $\varphi(x)$, $0 \le \varphi(x) \le 1$, $x \in \mathcal{X}$, is called
a critical function. The construction of the non-randomized $\mathcal{X}_{1\alpha}$-test
described above corresponds to the case when $\varphi(x)$ is an indicator of
the set $\mathcal{X}_{1\alpha}$, i.e., $\varphi(x) = 1$ for $x \in \mathcal{X}_{1\alpha}$ and $\varphi(x) = 0$ for $x \notin \mathcal{X}_{1\alpha}$. The
power function of a randomized test is defined by the relation
$W(\theta) = W(\varphi; \theta) = E_\theta\varphi(X)$.
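
Randomization matters for discrete statistics, where a non-randomized critical region usually cannot attain a prescribed level exactly. The sketch below, with hypothetical names and a hypothetical binomial setting ($n = 10$, $\theta_0 = 0.5$, rejection for large numbers of successes), builds a critical function whose size is exactly $\alpha = 0.05$.

```python
from math import comb


def binom_pmf(k, n, theta):
    return comb(n, k) * theta**k * (1 - theta) ** (n - k)


def critical_function(n, theta0, alpha):
    """phi(k) for a one-sided test rejecting for large k, with E_theta0 phi = alpha exactly."""
    phi = [0.0] * (n + 1)
    remaining = alpha
    for k in range(n, -1, -1):
        p = binom_pmf(k, n, theta0)
        if p <= remaining:
            phi[k] = 1.0              # reject with certainty
            remaining -= p
        else:
            phi[k] = remaining / p    # reject with this probability on the boundary
            break
    return phi


def power(phi, n, theta):
    """W(phi; theta) = E_theta phi(X)."""
    return sum(phi[k] * binom_pmf(k, n, theta) for k in range(n + 1))


phi = critical_function(10, 0.5, 0.05)
print([round(v, 4) for v in phi], round(power(phi, 10, 0.5), 6))
```

The hypothesis is rejected outright for 9 or 10 successes, and with probability about 0.893 for 8 successes, which brings the size to exactly 0.05.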

3.5. Most of the methods for constructing optimal tests are based
on the Neyman-Pearson theory of tests, which states that when verifying a
simple hypothesis against a simple alternative one can find a most
powerful test. Indeed, if $\Theta = \{\theta_0, \theta_1\}$, then for any significance level
$\alpha$ the most powerful test for verifying the hypothesis $H_0$: $\theta = \theta_0$
against the alternative $H_1$: $\theta = \theta_1$ exists and is defined by the critical
region

$$\mathcal{X}^*_{1\alpha} = \{x : l(x) = L(x; \theta_1)/L(x; \theta_0) \ge c_\alpha\}, \qquad (3.8)$$

where $L(x; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$ is the likelihood function (see [7,
Sec. 4.2] on some properties of discrete observations).
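
As an illustration of the Neyman-Pearson construction, the sketch below considers a model chosen here as an assumption, namely $\mathcal{N}(\theta, 1)$ observations with $\theta_1 > \theta_0$. In that model the likelihood ratio is an increasing function of the sample mean, so the most powerful critical region takes the form "sample mean at least some boundary". All function names are hypothetical.

```python
from math import exp, sqrt
from statistics import NormalDist


def likelihood_ratio(sample, theta0, theta1):
    """l(x) = L(x; theta1) / L(x; theta0) for independent N(theta, 1) observations."""
    n, s = len(sample), sum(sample)
    return exp((theta1 - theta0) * s - n * (theta1**2 - theta0**2) / 2)


def mean_boundary(n, theta0, alpha):
    """Boundary c such that P(sample mean >= c | theta0) = alpha."""
    return theta0 + NormalDist().inv_cdf(1 - alpha) / sqrt(n)


def mp_power(n, theta0, theta1, alpha):
    """Power of the most powerful test at the simple alternative theta1 > theta0."""
    z = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z - sqrt(n) * (theta1 - theta0))


print(round(mean_boundary(25, 0.0, 0.05), 4), round(mp_power(25, 0.0, 0.5, 0.05), 4))
```

The boundary does not depend on $\theta_1$, so in this model the same region is most powerful against every alternative $\theta_1 > \theta_0$.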

If we test a simple hypothesis $H_0$: $\theta = \theta_0$ against a composite alternative
$H_1$: $\theta \in \Theta \setminus \{\theta_0\}$, then a uniformly most powerful test exists
when the critical region $\mathcal{X}^*_{1\alpha} = \mathcal{X}^*_{1\alpha}(\theta_0; \theta_1)$ defined in (3.8) is independent
of the concrete $\theta_1 \in \Theta \setminus \{\theta_0\}$. In this case $\mathcal{X}^*_{1\alpha}$ is a u.m.p. test.
This is typical for an important class of models $\mathcal{F}$ with a monotone
likelihood ratio (i.e., the models having a sufficient statistic $T(X)$,
where the function $l(x) = g(T(x); \theta_1)/g(T(x); \theta_0)$ is monotone in $T$ (see
the factorization test in Sec. 2.3)), and also for one-sided alternatives
$H_1^{\pm}$: $\theta \gtrless \theta_0$ ($\theta$ is a scalar) [7, p. 192]. Moreover, for such models the
u.m.p. test for verifying a simple hypothesis $H_0$: $\theta = \theta_0$ against a right-sided
alternative $H_1^+$: $\theta > \theta_0$ is simultaneously a u.m.p. test for verifying
a composite hypothesis $H_0$: $\theta \le \theta_0$ against $H_1^+$ of the same significance
level (a similar statement is also true for a dual problem of
testing $H_0$: $\theta \ge \theta_0$ against $H_1^-$: $\theta < \theta_0$ [10]).

Specifically, for the exponential model defined by the density

$$f(x; \theta) = \exp\{A(\theta)B(x) + C(\theta) + D(x)\},$$

the statistic $T(X) = \sum_{i=1}^{n} B(X_i)$ is sufficient, and if the function $A(\theta)$
is strictly monotone, the u.m.p. $\mathcal{X}^*_{1\alpha}$-tests have the form given in
Table 3.1.



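Table 3.1

                      $H_1^+$: $\theta > \theta_0$      $H_1^-$: $\theta < \theta_0$

$A(\theta)$ increasing      $\{T(x) \ge c_\alpha\}$       $\{T(x) \le c_\alpha\}$

$A(\theta)$ decreasing      $\{T(x) \le c_\alpha\}$       $\{T(x) \ge c_\alpha\}$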



In some cases when testing a simple hypothesis $H_0$: $\theta = \theta_0$ against
a two-sided alternative $H_1$: $\theta \ne \theta_0$, it is also possible to construct a
u.m.p. unbiased test [7, p. 196].
It is sometimes possible to find the solution of this problem using
the following technique. If the $\mathcal{X}^+_{1\alpha_1}$ and $\mathcal{X}^-_{1\alpha_2}$ u.m.p. tests exist in the
investigated model against the one-sided alternatives $H_1^+$ and $H_1^-$,
respectively, then a test of the form $\mathcal{X}_{1\alpha} = \mathcal{X}^+_{1\alpha_1} \cup \mathcal{X}^-_{1\alpha_2}$ is used,
where $\alpha_1 + \alpha_2 = \alpha$.

The case of small deviations from the null hypothesis $H_0$: $\theta = \theta_0$
is especially interesting. When investigating the properties of a test,
we may restrict ourselves to the analysis of the local behaviour of the
power function $W(\theta)$ in the neighbourhood of the point $\theta_0$. This approach
allows us to construct a locally most powerful test even if the
u.m.p. test does not exist [7, p. 199].

3.6. In many cases the fact that testing a simple hypothesis about
$\theta$ is an inverse problem to that of constructing a confidence set for
$\theta$ can help greatly. Indeed, if $\mathcal{G}_\gamma(X)$ is a $\gamma$-confidence set for $\theta$, then
$\mathcal{X}_{0\alpha} = \{x : \theta_0 \in \mathcal{G}_\gamma(x)\}$ defines the acceptance region for the hypothesis
$H_0$: $\theta = \theta_0$ with the significance level $\alpha = 1 - \gamma$. The converse is also
true, i.e., if for every $\theta_0 \in \Theta$ there is a test $\mathcal{X}_{1\alpha} = \mathcal{X}_{1\alpha}(\theta_0)$ for verifying
the hypothesis $H_0$: $\theta = \theta_0$, then, having found the subset
$\mathcal{G}_\gamma(x) = \{\theta : x \in \mathcal{X}_{0\alpha}(\theta)\}$, $\gamma = 1 - \alpha$, for every $x \in \mathcal{X}$, we will prove that $\mathcal{G}_\gamma(X)$
is a $\gamma$-confidence set for $\theta$. Thus, if one of these problems is solved
for a certain model, then this algorithm can be used to solve the other
problem. Here the u.m.p. tests correspond to the shortest confidence
sets and vice versa.
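
The duality between tests and confidence sets can be seen by brute force. The sketch below works in an assumed model, $\mathcal{N}(\theta, 1)$ with the two-sided test based on the sample mean: it collects all values $\theta_0$ on a grid that the test accepts and compares the result with the familiar interval. Names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def accepts(theta0, xbar, n, alpha):
    """True if the two-sided test of H0: theta = theta0 accepts H0 (N(theta, 1) data)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(sqrt(n) * (xbar - theta0)) < z


def confidence_set(xbar, n, alpha, grid):
    """All grid values theta0 for which the observed mean lies in the acceptance region."""
    return [t for t in grid if accepts(t, xbar, n, alpha)]


xbar, n, alpha = 1.2, 25, 0.05                 # hypothetical observed mean and sample size
grid = [i / 1000 for i in range(0, 3001)]      # candidate values of theta0
accepted = confidence_set(xbar, n, alpha, grid)

half_width = NormalDist().inv_cdf(1 - alpha / 2) / sqrt(n)
print(min(accepted), max(accepted), round(xbar - half_width, 4), round(xbar + half_width, 4))
```

Up to the grid step, the set of accepted values coincides with the interval "sample mean plus or minus the half-width", a 0.95-confidence interval.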

3.7. The likelihood ratio method (l.r.m.) is universal for constructing
the tests for verifying composite parametric hypotheses. The general
form of the likelihood ratio test for testing a hypothesis $H_0$: $\theta \in \Theta_0$ is

$$\mathcal{X}_{1\alpha} = \mathcal{X}_{1\alpha}(\Theta_0, \Theta) = \left\{x : \lambda_n(x) = \sup_{\theta \in \Theta_0} L(x; \theta) \Big/ \sup_{\theta \in \Theta} L(x; \theta) \le c_\alpha\right\},$$

where the boundary $c_\alpha$ is chosen from the condition

$$W(\theta) = P_\theta(\lambda_n(X) \le c_\alpha) \le \alpha \quad \forall \theta \in \Theta_0.$$

In practice this approach gives satisfactory results. Besides, under some
conditions the likelihood ratio test is optimal for large
samples.

If the regularity conditions which ensure the existence, uniqueness,
and asymptotic normality of the maximum likelihood estimate
$\hat\theta_n = (\hat\theta_{1n}, \ldots, \hat\theta_{rn})$ for the parameter $\theta = (\theta_1, \ldots, \theta_r)$ are met (see
Sec. 2.4), then, given a simple hypothesis $H_0$: $\theta = \theta_0$ for large samples,
the likelihood ratio test is asymptotically defined by the critical region

$$\mathcal{X}_{1\alpha} = \{x : -2\ln\lambda_n(x) \ge \chi^2_{1-\alpha,\,r}\},$$

where $\chi^2_{p,r}$ is a $p$-quantile of the distribution $\chi^2(r)$. This test is consistent
($W_n(\theta) \to 1$ for $n \to \infty$ $\forall \theta \ne \theta_0$), and as $n \to \infty$ its power for the
close alternatives of the form $\theta_1^{(n)} = \theta_0 + \beta/\sqrt{n}$, $\beta = (\beta_1, \ldots, \beta_r) \ne 0$,
satisfies the relation

$$W_n(\theta_1^{(n)}) \to 1 - F_r(\chi^2_{1-\alpha,\,r}; \lambda^2),$$

where $\lambda^2 = \beta' I(\theta_0)\beta$, $I(\theta)$ is the information matrix of the model, and
$F_r(t; \lambda^2)$ is the distribution function of the non-central $\chi^2$-distribution with $r$ degrees
of freedom and the non-centrality parameter $\lambda^2$ [7, p. 210]. The likelihood
ratio test possesses similar asymptotic properties when the hypothesis
$H_0$ is composite [7, pp. 211-213].
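
For a one-parameter model the statistic $-2\ln\lambda_n$ often has a closed form. The sketch below treats the Poisson model with a simple hypothesis $H_0$: $\theta = \theta_0$ as an assumed example, where the maximum likelihood estimate is the sample mean and $-2\ln\lambda_n = 2\bigl[n(\theta_0 - \bar x) + n\bar x\ln(\bar x/\theta_0)\bigr]$. The sample and the function name are hypothetical.

```python
from math import log


def poisson_lr_statistic(sample, theta0):
    """-2 ln(lambda_n) for H0: theta = theta0 in the Poisson model (MLE = sample mean)."""
    n, s = len(sample), sum(sample)
    xbar = s / n
    # The term s * ln(xbar / theta0) is taken as 0 when all observations are zero.
    log_term = s * log(xbar / theta0) if s > 0 else 0.0
    return 2 * (n * (theta0 - xbar) + log_term)


CHI2_095_1DF = 3.841   # tabulated 0.95-quantile of chi^2 with r = 1 degree of freedom

sample = [3, 5, 4, 6, 2, 5, 4, 3]              # hypothetical counts, mean 4
for theta0 in (4.0, 2.0):
    stat = poisson_lr_statistic(sample, theta0)
    print(theta0, round(stat, 4), stat >= CHI2_095_1DF)
```

The statistic vanishes when $\theta_0$ equals the sample mean and grows as $\theta_0$ moves away from it; for $\theta_0 = 2$ it exceeds the boundary 3.841, so that hypothesis would be rejected at the level 0.05.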



Problems 

Goodness of Fit Tests 

3.1. Given the data of Problem 1.13, check whether they are consistent
with the hypothesis $H_0$ that the coin was symmetric. Take the significance
level (a) 0.05; (b) 0.1.

3.2. Given the data of Problem 1.14, test the hypothesis $H_0$ that the
numbers are random. For what significance level should the hypothesis
$H_0$ be rejected?

3.3. In $n = 4000$ independent trials the events $A_1$, $A_2$, $A_3$ which
constitute a complete group were realized 1905, 1015, 1080 times,
respectively. Given the significance level 0.05, check whether these data
are consistent with the hypothesis $H_0$: $p_1 = 1/2$, $p_2 = p_3 = 1/4$, where
$p_j = P(A_j)$.

3.4. The number $\pi$ written in decimal form contains in the first
10 002 positions after the decimal point the digits 0, 1, ..., 9 respectively
968, 1026, 1021, 974, 1014, 1046, 1021, 970, 948, 1014 times [1].
Given the significance level 0.05, can we consider these digits to be
random numbers? For what significance level should the hypothesis
be rejected?

3.5. Are the data in Problems 1.16 and 1.17 consistent with the
hypothesis that the dice are symmetric?

3.6. A large batch of goods may contain some defective items. The
supplier assumes that they constitute 3% of the batch, while the buyer
insists on 10%. The contract stipulates that if 20 randomly chosen
items contain no more than one defective item, the batch is accepted
on the supplier's terms, otherwise it is accepted on the terms of the
buyer. Find (1) the statistical hypotheses, the test statistic, its domain,
and critical region; (2) the distribution of the test statistic, Type I and
Type II errors, and their probabilities.

3.7. Given the significance level 0.1, are the data in Problem 1.19
consistent with the hypothesis $H_0$ that the times shown by the watches
are uniformly distributed on the interval (0, 12)? For what significance
levels is the hypothesis $H_0$ accepted?

3.8. When breeding pea-plants, Mendel observed the frequencies of
various seeds produced by hybrids of round yellow peas and wrinkled
green peas. These data and the respective probabilities as predicted
by Mendel's theory of heredity are given in the following table:

Seeds                    Frequency     Probability

Round and yellow           315           9/16
Wrinkled and yellow        101           3/16
Round and green            108           3/16
Wrinkled and green          32           1/16

Total                    $n = 556$

Check the hypothesis $H_0$ that the frequency data are consistent with
the theoretical probabilities (at the significance level $\alpha = 0.9$).

3.9. Using a table for some function ($\cos x$, $e^x$, $\ln x$, etc.), write out
100 digits choosing the second digit after the point at each value. Test
the hypothesis that the numbers 0, 1, ..., 9 are random at the significance
level (a) 0.05; (b) 0.01.

3.10. Grouping the data in Problem 1.21 into $N = 4$ equiprobable
intervals (under the hypothesis $H_0$), test the hypothesis $H_0$:
$F_\xi(x) = 1 - e^{-x}$, $x \ge 0$ (at the significance level 0.1).

3.11. Given a sample $X = (X_1, \ldots, X_n)$, test the hypothesis that
the observable random variable $\xi$ is distributed exponentially, i.e.,
$H_0$: $F_\xi(x) = 1 - e^{-x/\theta}$, $x \ge 0$ (the parameter $\theta > 0$ is unknown). Applying
the method of grouped observations with the intervals
$A_j = [(j - 1)a, ja)$, $j = 1, \ldots, N - 1$, $A_N = [(N - 1)a, \infty)$, where
$a > 0$ is a given number, construct a $\chi^2$ goodness of fit test for the
hypothesis $H_0$. Analyze the data in Problem 1.21 from this point of
view, assuming that $N = 3$, $a = 1$.

3.12. In Fisher's genetic model [1] the probabilities of four types
of the offspring are

$$p_1(\theta) = \frac{2 + \theta}{4}, \quad p_2(\theta) = p_3(\theta) = \frac{1 - \theta}{4}, \quad p_4(\theta) = \frac{\theta}{4},$$

where $\theta \in (0, 1)$ is an unknown parameter. Construct the $\chi^2$-test to
check whether this model is consistent with the actual data.

3.13. In 8000 independent trials the events $A$, $B$, $C$ which form a
complete group occurred 2014, 5012, 974 times, respectively. Is the
hypothesis $H_0$: $P(A) = 0.5 - 2\theta$, $P(B) = 0.5 + \theta$, $P(C) = \theta$,
$0 < \theta < 0.25$, true at the significance level 0.05?

Hint. Use the solution to Problem 3.12.

3.14. Test the hypothesis $H_0$: $\mathcal{L}(\xi) = \Pi(\theta)$, where $\theta$ is an unknown
parameter, for the data of Problem 1.23.

Hint. Take the sample mean as the estimate for the unknown
parameter $\theta$ [7, p. 152].

3.15. The number of gold particles $\xi$ in a thin layer of suspension
under a microscope was registered in equal time intervals. Using the
data from the table

The number
of particles     0     1     2     3     4     5     6     7     Total

$m_i$            112   168   130   69    32    5     1     1     $\sum m_i = 518$

test the hypothesis $H_0$: $\mathcal{L}(\xi) = \Pi(\theta)$, where $\theta$ is an unknown
parameter.

3.16. The table below gives the number $m_i$ of 0.25-km$^2$ plots in the
southern part of London each of which was bombed $i$ times during
World War II. Check whether these data are consistent with
the Poisson distribution law at the significance level $\alpha = 0.05$.

$i$       0      1      2      3      4      5 and more     Total

$m_i$     229    211    93     35     7      1              $\sum m_i = 576$



3.17. Of 2020 families with two children, 527 families have two boys,
476 families have two girls, and the remaining 1017 families have children
of both sexes. Given the significance level 0.05, can we consider
that the number of boys in two-children families is a binomial random
variable?

3.18. Among 2000 individuals, 181 suffered from flu once, 9
had it twice, and the remaining 1810 were healthy. Given the significance
level 0.05, do these data correspond to the hypothesis that
the number of illnesses an individual suffers is a binomial random
variable?

Hint. See the solution to Problem 3.17.

3.19*. Investigate the asymptotic behaviour (as $n \to \infty$) of the mean
and variance of the statistic $X_n^2$ of the $\chi^2$-test against "close" alternatives
of the form

Hint. Use the formulas for $E(X_n^2 \mid p)$ and $D(X_n^2 \mid p)$ given in
[7, p. 145].

3.20*. Let $\mu_0 = \mu_0(n, N)$ be the number of empty intervals when $n$
observations are grouped into $N$ equiprobable (under the hypothesis
$H_0$) intervals. Consider the hypotheses of the form

/*f>: Ar-J^ = jr;(i +;$*). ■/-■. ■■•>"> 

where 

A) J* 

max \bj\ *s e < «, y]bj m 0, ,b*(N) = ±j/jb}~*b*>0 
***** ,_, " ,_, 

for $N \to \infty$.
Prove that for $n, N \to \infty$, $n/N = \rho > 0$ we have

E( w |«f*>> = Ne"« + ^&lN) "*e-* + 0(N ui ), 

D{m\fifi = Ne _e (l - e- e (l + e))(l + OifT"^ 

Hint. Use formulas (3.16) from [7, p. 155].

3.21. Entrants to a university are divided into two groups, 300 people
in each group. In the first group 33, 43, 80, 144 individuals got
grades "2", "3", "4", "5", respectively. The data for the second group
were 39, 35, 72, 154 individuals, respectively. Can we consider both
groups to be homogeneous at the significance level 0.05?

3.22. The table

$n_j$       1072     1133     2455     1995

$\nu_j$       22       23       49       33

gives the data on the death rate of mothers having their first baby
in four periods of time [1] ($n_j$ is the number of mothers, $\nu_j$ is the number
of deaths). Test the hypothesis $H_0$ that the death rates in these
periods do not differ.

Hint. Use the homogeneity $\chi^2$-test for trials with two outcomes.

3.23*. Let two series of $n_1$ and $n_2$ independent trials be carried out,
each of which has either the outcome $A$ or the outcome $\bar A$. The results
are given in the table

              (1)                   (2)                   $\Sigma$

$A$           $\nu_{11}$            $\nu_{12}$            $\nu_{1\cdot}$

$\bar A$      $\nu_{21}$            $\nu_{22}$            $\nu_{2\cdot}$

$\Sigma$      $\nu_{\cdot 1} = n_1$   $\nu_{\cdot 2} = n_2$   $n = n_1 + n_2$

where the columns show the number of the realizations of the respective
outcomes in each series.

(1) Make sure that the statistic $X_n^2$ used to test the hypothesis $H_0$
that the trials are homogeneous can be written as $X_n^2 = Z_n^2$, where

$$Z_n = \left(\frac{\nu_{11}}{n_1} - \frac{\nu_{12}}{n_2}\right)\sqrt{\frac{n\,n_1 n_2}{\nu_{1\cdot}\,\nu_{2\cdot}}}.$$

(2) Prove that $\mathcal{L}(Z_n \mid H_0) \to \mathcal{N}(0, 1)$ as $n_1, n_2 \to \infty$ and construct
a test for verifying the hypothesis $H_0$: $p_1 = p_2$ against the one-sided
alternative $H_1$: $p_1 > p_2$ (here $p_i$ is the probability of the realization
of $A$ in the trials of the $i$th series, $i = 1, 2$).

3.24. Let $\nu_1, \ldots, \nu_N$ be independent random variables with
$\mathcal{L}(\nu_i) = \Pi(\theta_i)$, $i = 1, \ldots, N$, where the parameters $\theta_i$ are unknown.
Assume that $\nu_1 + \ldots + \nu_N = n$ and construct a test for verifying the
homogeneity hypothesis $H_0$: $\theta_1 = \ldots = \theta_N$.

Hint. Use Problem 1.54.




3,25*. {The empty box lest.) Suppose that X « (Xi, ..., X„) is a 
sample from the distribution -/"(£) = R(0, I), = X im < X m ^ 
X w ^ . . . < Jt"(„> < A'<„ + i) = 1 is its ordered sample, and 
fl, = (A"*!- 1), Xm], i = I, ...,« + t, are the sampling blocks generat- 
ed by it. Suppose that Y = ( Y\ , . . . , Ym) is a sample from a different 
distribution -*"(ij) on the interval [0, 1] which is independent of X. 
The distribution function Fix) of -/(q) has the density /<■*) = F'(x). 
The number of units of the sample V, which are in the block Bi, i = I, 
, . ., n + I, is denoted ki ■ x;(/i, m). 

(1) Prove that under che homogeneity hypothesis Ho: -A£) = -jflfo) 
the vector of the block frequencies x — (xi, . . .. x n + i) assumes all 
the possible values with the same probability (C" + m )"'. Show thai 
the conditional distribution -f{(t , . . . . £n + 1 1 fi + - ■ - + f« + i = w )> 
where the random variables & , . . . , £«+ j are independent and have 
the geometric distribution Bi{\, p) with an arbitrary p 6 (0, 1), has 
the same form. 

(2) Consider the statistic $s_0(n, m)$ (the number of empty blocks),
viz.,

$$s_0(n, m) = \sum_{i=1}^{n+1} I(\kappa_i = 0),$$

where $I(\cdot)$ is an indicator, and use the representation

$$\mathcal{L}(s_0(n, m)) = \mathcal{L}\Bigl(\sum_{i=1}^{n+1} I(\xi_i = 0) \Bigm| \xi_1 + \ldots + \xi_{n+1} = m\Bigr),$$

which stems from Sec. 3.1, to prove that $s_0(n, m)$ has a
hypergeometric distribution $H(n + 1, n + m, n)$. Derive from this the
expressions for the mean and variance of the statistic $s_0(n, m)$
under the hypothesis $H_0$.

(3) Prove that if $n, m \to \infty$ so that $m/n = \rho > 0$, then

$$\mathcal{L}(s_0(n, m) \mid H_0) \sim \mathcal{N}\bigl(n/(1 + \rho),\; n\rho^2/(1 + \rho)^3\bigr).$$

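(4) Prove that under the specified conditions for any alternative $H_1$
defined by the density $f(x) \not\equiv 1$, $x \in [0, 1]$, we will have

$$\lim_{n \to \infty} \mathsf{E}\Bigl(\frac{s_0(n, m)}{n} \Bigm| H_1\Bigr) = \int_0^1 \frac{dx}{1 + \rho f(x)} > \frac{1}{1 + \rho}.$$

Using these results, formulate the empty box test for verifying the
homogeneity hypothesis $H_0$ [7, p. 162].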
Hints. (1) Use the fact that the conditional distribution of the vector
$\kappa = (\kappa_1, \ldots, \kappa_{n+1})$ for the fixed values
$(X_{(1)}, \ldots, X_{(n)}) = (x_1, \ldots, x_n)$ is a polynomial
distribution $M(m;\, x_1, x_2 - x_1, \ldots, x_n - x_{n-1}, 1 - x_n)$.
Then use Problems 1.39 (5) and 1.31.

(2) Consider the independent random variables $\xi_i'$ distributed as
$P(\xi_i' = r) = P(\xi_i = r \mid \xi_i > 0)$, $r = 1, 2, \ldots$,
$i = 1, 2, \ldots$, and show that

$$P(\xi_1' + \ldots + \xi_k' = m) = C_{m-1}^{k-1} p^k q^{m-k}, \quad q = 1 - p.$$

(3) Use the normal approximation for a binomial distribution, i.e., for
$n \to \infty$ and $0 < p < 1$, $k = np + t\sqrt{npq}$,
$|t| \le c < \infty$,

$$b(k; n, p) = C_n^k p^k q^{n-k} = \frac{1}{\sqrt{2\pi npq}}\, e^{-t^2/2}\,(1 + o(1)).$$

Write the probability $P(s_0(n, m) = k)$ in the form

$$P(s_0(n, m) = k) = b(k; n + 1, p)\, b(n - k; m - 1, p)/b(n; n + m, p), \quad p = 1/(1 + \rho).$$

(4) Use Problem 1.31 to calculate
$\mathsf{E}[I(\kappa_i = 0) \mid H_1] = P(\kappa_i = 0 \mid H_1)$.
While estimating the integral, apply the Cauchy-Schwarz inequality

$$\Bigl(\int g_1(x) g_2(x)\, dx\Bigr)^2 \le \int g_1^2(x)\, dx \int g_2^2(x)\, dx$$

for $g_1(x) = \sqrt{1 + \rho f(x)}$, $g_2(x) = g_1^{-1}(x)$.

3.26. Test the independence hypothesis for the following bivariate
contingency table:

              1        2        3        Σ
     1      3009     2832     3008     8849
     2      3047     3051     2997     9095
     3      2974     3038     3018     9030
     Σ      9030     8921     9023    26974

The significance level is 0.05.
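
As an illustration only, the Pearson statistic for such a table can be computed directly from the observed and expected frequencies. The sketch below is a minimal pure-Python version; the function name `chi_square_independence` and the constant `CRITICAL_095_DOF4` are hypothetical names introduced here, and 9.488 is the tabulated 0.95-quantile of the chi-square distribution with 4 degrees of freedom.

```python
def chi_square_independence(table):
    """Return (Pearson statistic, degrees of freedom) for a table of counts."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    n = sum(row_sums)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: (row total * column total) / n
            expected = row_sums[i] * col_sums[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_sums) - 1) * (len(col_sums) - 1)
    return stat, dof


table_3_26 = [
    [3009, 2832, 3008],
    [3047, 3051, 2997],
    [2974, 3038, 3018],
]
stat, dof = chi_square_independence(table_3_26)

# Tabulated 0.95-quantile of chi-square with 4 degrees of freedom.
CRITICAL_095_DOF4 = 9.488
reject = stat > CRITICAL_095_DOF4
print(f"chi2 = {stat:.3f}, dof = {dof}, reject H0: {reject}")
```

The hypothesis is rejected only when the statistic exceeds the tabulated quantile; for other table sizes the critical value has to be taken from the chi-square table with (r - 1)(c - 1) degrees of freedom.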
3.27. Of 300 university entrants who passed the entrance examination,
97 obtained the top grade at school, and 48 got the top grade in the
entrance examination, but only 18 obtained the top grade both at
school and in the examination. Test the hypothesis that the school
marks and entrance marks are independent (the significance level is
0.1).

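3.28*. Consider the following bivariate contingency table:

                   \xi_2 = 1      \xi_2 = 0      Σ
    \xi_1 = 1      \nu_{11}       \nu_{12}       \nu_{1\cdot}
    \xi_1 = 0      \nu_{21}       \nu_{22}       \nu_{2\cdot}
    Σ              \nu_{\cdot 1}  \nu_{\cdot 2}  n

(1) Make sure that the statistic $X_n^2$ [7, p. 166] for testing the
hypothesis $H_0$ on the independence of the factors $\xi_1$ and $\xi_2$
can be represented as $X_n^2 = Z_n^2$, where

$$Z_n = n^{3/2}\Bigl(\nu_{11} - \frac{\nu_{1\cdot}\nu_{\cdot 1}}{n}\Bigr) \Big/ \sqrt{\nu_{1\cdot}\nu_{2\cdot}\nu_{\cdot 1}\nu_{\cdot 2}} = \sqrt{n}\,(\nu_{11}\nu_{22} - \nu_{12}\nu_{21}) \Big/ \sqrt{\nu_{1\cdot}\nu_{2\cdot}\nu_{\cdot 1}\nu_{\cdot 2}}.$$

(2) Show that the sample correlation coefficient is
$\rho_n = Z_n/\sqrt{n}$ and hence
$Z_n/\sqrt{n} \to \rho = \operatorname{corr}(\xi_1, \xi_2)$ as
$n \to \infty$ (see Problem 1.38). Derive the equations

$$\rho = \frac{P(AB) - P(A)P(B)}{\sqrt{P(A)P(\bar A)P(B)P(\bar B)}} = \bigl[P(A \mid B) - P(A \mid \bar B)\bigr] \sqrt{\frac{P(B)P(\bar B)}{P(A)P(\bar A)}}$$

with the events $A = \{\xi_1 = 1\}$, $\bar A = \{\xi_1 = 0\}$,
$B = \{\xi_2 = 1\}$, $\bar B = \{\xi_2 = 0\}$.

(3) Prove that $\mathcal{L}(Z_n \mid H_0) \to \mathcal{N}(0, 1)$ as
$n \to \infty$ and then construct a test for verifying the hypothesis
$H_0$ against the alternative $H_1$:
$P(A \mid B) > P(A \mid \bar B)$, which means that the events $A$ and
$B$ are positively conjugate (the probability of the pair $A$ and $B$
is greater than that of the pair $A$ and $\bar B$).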
3.29. We have two groups of data classified by two features, i.e.,
"accepted $A$ - rejected $\bar A$ for entrance to the college" and
"male $B$ - female $\bar B$":

                 B      \bar B     Σ
    A            97       40       137
    \bar A      263       42       305
    Σ           360       82       n = 442

                 B      \bar B     Σ
    A           235       38       273
    \bar A       35        7        42
    Σ           270       45       n = 315

For each table test the hypothesis $H_0$ that the features $A$ and $B$
are independent against the alternative $H_1$:
$P(A \mid B) > P(A \mid \bar B)$.

3.30. The table below [8] gives 818 cases classified according to two
features, i.e., vaccinated against cholera $A$ and healthy $B$:

                 B      \bar B     Σ
    A           276        3       279
    \bar A      473       66       539
    Σ           749       69       818

Construct a test for verifying the hypothesis $H_0$ that the features
$A$ and $B$ are independent against the alternative $H_1$ that $A$ and
$B$ are positively conjugate (i.e., that the vaccination is effective).
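
For a 2 x 2 table the statistic $Z_n$ of Problem 3.28 reduces to $\sqrt{n}\,(\nu_{11}\nu_{22} - \nu_{12}\nu_{21})/\sqrt{\nu_{1\cdot}\nu_{2\cdot}\nu_{\cdot 1}\nu_{\cdot 2}}$, and the one-sided test rejects $H_0$ for large positive values. The following Python sketch is an illustration under that normal approximation; the function name `z_statistic` is hypothetical, and the p-value is only asymptotic.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(n11, n12, n21, n22):
    """Z_n for a 2 x 2 table with rows (A, not A) and columns (B, not B)."""
    r1, r2 = n11 + n12, n21 + n22   # row totals
    c1, c2 = n11 + n21, n12 + n22   # column totals
    n = r1 + r2
    return sqrt(n) * (n11 * n22 - n12 * n21) / sqrt(r1 * r2 * c1 * c2)


# Data of Problem 3.30 (vaccination against cholera).
z = z_statistic(276, 3, 473, 66)
# One-sided asymptotic p-value: P(N(0, 1) >= z).
p_value = 1.0 - NormalDist().cdf(z)
print(f"Z_n = {z:.3f}, one-sided p-value = {p_value:.2e}")
```

A negative value of $Z_n$, as for the first table of Problem 3.29, never leads to rejection in favour of positive conjugation.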

3.31. Given the significance level 0.001, can we consider the sequence
1.05, 1.12, 1.37, 1.50, 1.51, 1.73, 1.85, 1.98 to be a realization of a
random vector all eight components of which are independent and
identically distributed random variables?
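
One possible way to examine such a sequence is the inversion statistic $T_n$ of Problem 3.32. The Python sketch below is an assumption-laden illustration: it takes an inversion to be a pair $i < j$ with $x_i > x_j$, and it obtains the exact null distribution by expanding the product $(1 + z)(1 + z + z^2)\cdots(1 + z + \ldots + z^{n-1})$. The helper names are hypothetical.

```python
from math import factorial


def count_inversions(xs):
    """Number of pairs i < j with xs[i] > xs[j]."""
    n = len(xs)
    return sum(1 for i in range(n) for j in range(i + 1, n) if xs[i] > xs[j])


def inversion_counts(n):
    """Coefficients of prod_{r=1}^{n-1} (1 + z + ... + z^r).

    Entry t is the number of permutations of n elements with t inversions.
    """
    coeffs = [1]
    for r in range(1, n):
        new = [0] * (len(coeffs) + r)
        for t, c in enumerate(coeffs):
            for s in range(r + 1):
                new[t + s] += c
        coeffs = new
    return coeffs


data = [1.05, 1.12, 1.37, 1.50, 1.51, 1.73, 1.85, 1.98]
t_obs = count_inversions(data)
counts = inversion_counts(len(data))
# Exact lower-tail probability P(T_n <= t_obs) under the randomness hypothesis.
lower_tail = sum(counts[: t_obs + 1]) / factorial(len(data))
print(t_obs, lower_tail)
```

Since the sequence is strictly increasing, the observed number of inversions is the smallest possible, and the tail probability is to be compared with the significance level 0.001.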

3.32*. Assuming that

$$\varphi_n(z) = \mathsf{E} z^{T_n} = \frac{1}{n!} \prod_{r=1}^{n-1} (1 + z + \ldots + z^r)$$

is a representation of the generating function of the statistic $T_n$
(the number of inversions in a repeated random sample of size $n$)
[7, p. 170], prove that
$\mathcal{L}(T_n) \sim \mathcal{N}\bigl(n(n - 1)/4,\; n^3/36\bigr)$
as $n \to \infty$.

Hint. Consider the characteristic function and show that

$$\mathsf{E} \exp\Bigl\{\frac{6it}{n^{3/2}}\Bigl(T_n - \frac{n(n - 1)}{4}\Bigr)\Bigr\} \to \exp(-t^2/2)$$

for $n \to \infty$ and $|t| \le c < \infty$.

3.33. Test the hypothesis that the data in Problem 1.22 are random.

Hint. Use the asymptotic variant of the test based on the statistic
$T_n$ (see the previous problem).

3.34. Obtain samples (of sizes n = 20, 50, 100) of uniformly
distributed random numbers. Use the $\chi^2$ and Kolmogorov tests to
verify the hypothesis that the distribution is uniform.
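
A simulation of this kind might be organized as in the following Python sketch. It is only an illustration: the seed and the helper name `kolmogorov_statistic` are arbitrary choices, Python's built-in generator stands in for "tables of random numbers", and 1.358 is the approximate 0.95-quantile of Kolmogorov's limiting distribution, so the comparison is asymptotic and rough for n = 20.

```python
import random
from math import sqrt


def kolmogorov_statistic(sample, cdf):
    """D_n = sup |F_n(x) - F(x)| for a continuous hypothesized cdf F."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs)
    )


def uniform_cdf(x):
    """Distribution function of R(0, 1) on [0, 1]."""
    return x


rng = random.Random(2024)  # arbitrary seed, used only for reproducibility
for n in (20, 50, 100):
    sample = [rng.random() for _ in range(n)]
    d_n = kolmogorov_statistic(sample, uniform_cdf)
    # Reject uniformity at the approximate 0.05 level if sqrt(n) * D_n > 1.358.
    print(n, round(sqrt(n) * d_n, 3), sqrt(n) * d_n > 1.358)
```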

3.35. Obtain samples (of size n = 100) of approximately normally
distributed numbers by summing up N uniformly distributed terms
(N = 4, 8, 12). Use the $\chi^2$ and Kolmogorov tests to verify the
hypothesis that the distribution is normal.
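
The summation recipe can be sketched as follows. This is an illustrative Python fragment rather than a prescribed procedure: each sum of N uniform R(0, 1) terms is centred by its mean N/2 and scaled by its standard deviation $\sqrt{N/12}$, and the result is compared with the standard normal law through Kolmogorov's statistic. The function names and the seed are assumptions made here.

```python
import random
from math import sqrt
from statistics import NormalDist


def approx_normal(rng, n_terms):
    """Standardized sum of n_terms uniform(0, 1) numbers.

    The sum has mean n_terms / 2 and variance n_terms / 12.
    """
    total = sum(rng.random() for _ in range(n_terms))
    return (total - n_terms / 2) / sqrt(n_terms / 12)


def kolmogorov_statistic(sample, cdf):
    """D_n = sup |F_n(x) - F(x)| for a continuous hypothesized cdf F."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs)
    )


rng = random.Random(7)  # arbitrary seed, used only for reproducibility
for n_terms in (4, 8, 12):
    sample = [approx_normal(rng, n_terms) for _ in range(100)]
    d_n = kolmogorov_statistic(sample, NormalDist().cdf)
    print(n_terms, round(sqrt(len(sample)) * d_n, 3))
```

For N = 12 the scaling factor equals 1, so the generated value is simply the sum minus 6.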

3.36. Obtain a sample $X_i$ (of size n = 200) of uniformly distributed
random numbers. Using Smirnov's test, verify that
$(X_{2i},\ i = 1, 2, \ldots, 100)$ and
$(X_{2i+1},\ i = 0, 1, \ldots, 99)$ are samples from the same
distribution.

3.37. Simulate a sequence $\{X_i\}$ of polynomial pseudo-random
variables assuming the values $1, \ldots, N$. Form two samples
$(X_{2i},\ i = 1, \ldots, n)$ and $(X_{2i+1},\ i = 0, \ldots, n - 1)$
from this sequence and use the $\chi^2$ test to verify the hypothesis
that the values corresponding to these samples are independent. Work
with N = 2, 4, 10, and n = 100.

3.38. Obtain samples (of sizes n = 10, 20, 40, 100) of uniform
pseudo-random numbers. Using the statistic $T_n$ (the number of
inversions in the ordered series of the sample), test the hypothesis
on randomness.

A Choice Between Two Simple Hypotheses 

3.39. Let $X = (X_1, \ldots, X_n)$ be a sample from the binomial
distribution $Bi(k, \theta)$. Construct a Neyman-Pearson test to verify
the hypothesis $H_0$: $\theta = \theta_0$ against the alternative
$H_1$: $\theta = \theta_1$, $0 < \theta_0 < \theta_1 < 1$, and
calculate its power.

3.40. (Continued from Problem 3.39.) Show that for $n \to \infty$ the
test can be asymptotically defined by the critical region

$$\bigl\{T \ge kn\theta_0 - u_\alpha \sqrt{kn\theta_0(1 - \theta_0)}\bigr\}, \quad T = \sum_{i=1}^{n} X_i, \quad \Phi(u_\alpha) = \alpha,$$
and its power $W_n(\theta_1)$ satisfies the relation

$$\lim_{n \to \infty} W_n(\theta_1) = \Phi\Bigl(\beta\sqrt{\frac{k}{\theta_0(1 - \theta_0)}} + u_\alpha\Bigr)$$

for $\theta_1 = \theta^{(n)} = \theta_0 + \beta/\sqrt{n}$, $\beta > 0$.

Hint. Use the De Moivre-Laplace theorem.

3.41. Given a sample $X = (X_1, \ldots, X_n)$ from Poisson's
distribution $\Pi(\theta)$, construct a Neyman-Pearson test to verify
the hypothesis $H_0$: $\theta = \theta_0$ against the alternative
$H_1$: $\theta = \theta_1$, $0 < \theta_0 < \theta_1$, and calculate
its power. Investigate the asymptotic behaviour of the test's
characteristics for large samples.

Hint. Use Problem 1.39 (4) and the normal approximation for Poisson's
distribution with a growing parameter. Consider a "close" alternative
of the form given in Problem 3.40.

3.42. Observe the number of successes before the first failure in an
experiment used to verify the hypothesis $H_0$: $\theta = \theta_0$
against the alternative $H_1$: $\theta = \theta_1$,
$0 < \theta_0 < \theta_1 < 1$, in a Bernoulli scheme with an unknown
probability of success $\theta$. Construct the most powerful test at
the significance level $\alpha = \theta_0^s$, where $s \ge 1$ is a
given number, and show that the probability of Type II error for this
test is $\beta = 1 - \theta_1^s$.

3.43. Let $X = (X_1, \ldots, X_n)$ be a sample from the exponential
distribution $\Gamma(\theta, 1)$. Construct a most powerful test to
verify the hypothesis $H_0$: $\theta = \theta_0$ against the
alternative $H_1$: $\theta = \theta_1$ and calculate its power
function.

Hint. Use the fact that $\mathcal{L}(2X_i/\theta) = \chi^2(2)$ (see
Problem 1.51) and the solution to Problem 1.39 (2).

3.44. Test the hypothesis $H_0$: $\theta = 0$ against the alternative
$H_1$: $\theta = 1$ for Cauchy's distribution $C(\theta)$. Show that
for the significance level
$\alpha = 1/2 - (1/\pi) \arctan(1/2) = 0.352$ the most powerful test
constructed from one observation has the form
$\mathcal{X}_{1\alpha} = \{X > 1/2\}$ and its power is
$1/2 + (1/\pi) \arctan(1/2) = 0.648$. If
$\alpha = (1/\pi)(\arctan 3 - \arctan 1) = 0.148$, then the test has
the form $\mathcal{X}_{1\alpha} = \{1 < X < 3\}$, and its power is
$(1/\pi) \arctan 2 = 0.352$.
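
The four numerical values quoted in this problem can be checked from the Cauchy distribution function $F(x; \theta) = 1/2 + (1/\pi)\arctan(x - \theta)$. The Python lines below are a small verification sketch; the function name `cauchy_cdf` is an assumption made here.

```python
from math import atan, pi


def cauchy_cdf(x, theta=0.0):
    """Distribution function of the Cauchy law C(theta) with unit scale."""
    return 0.5 + atan(x - theta) / pi


# Critical region {X > 1/2}: size under theta = 0, power under theta = 1.
alpha_a = 1.0 - cauchy_cdf(0.5, 0.0)
power_a = 1.0 - cauchy_cdf(0.5, 1.0)

# Critical region {1 < X < 3}: size under theta = 0, power under theta = 1.
alpha_b = cauchy_cdf(3.0, 0.0) - cauchy_cdf(1.0, 0.0)
power_b = cauchy_cdf(3.0, 1.0) - cauchy_cdf(1.0, 1.0)

print(round(alpha_a, 3), round(power_a, 3), round(alpha_b, 3), round(power_b, 3))
```

The second region is a bounded interval because the likelihood ratio of two Cauchy densities is not monotone in $x$.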

3.45*. Construct a test to verify the hypothesis $H_0$:
$\mathcal{L}(\xi) = R(-a, a)$ against the alternative $H_1$:
$\mathcal{L}(\xi) = \mathcal{N}(0, \sigma^2)$ (the parameters $a$ and
$\sigma$ are given) under the condition that the observable random
variable $\xi$ is distributed symmetrically about zero. Consider the
case of a large sample. Analyze the data: -0.460, -0.114, -0.325,
+0.196, -0.174 for $a = 1/2$ and $\sigma^2 = 0.09$.

Hint. Use the Central Limit Theorem.

3.46. In a sequence of independent trials the probabilities of positive
outcomes are the same and equal to $p$. Construct a test to verify the
hypothesis $H_0$: $p = 0$ against the alternative $H_1$: $p = 0.01$ and
find the smallest sample size for which the probabilities of Type I and
Type II errors do not exceed 0.01.

3.47. Given a sample $X = (X_1, \ldots, X_n)$ from the distribution
$\mathcal{N}(\theta, \sigma^2)$, find a most powerful test to
distinguish between two simple hypotheses $H_0$: $\theta = \theta_0$
and $H_1$: $\theta = \theta_1$. Calculate its power and show that it is
unbiased.

3.48. (Continued from Problem 3.47.) Define the minimum sample size
$n^* = n^*(\alpha, \beta)$ for which the probabilities of Type I and
Type II errors are not greater than $\alpha$ and $\beta$, respectively.
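
As a numerical companion, the sketch below evaluates the standard sample-size bound for the one-sided test on a normal mean with known $\sigma$, namely the smallest integer $n$ with $n \ge \sigma^2 (z_{1-\alpha} + z_{1-\beta})^2/(\theta_1 - \theta_0)^2$, where $z_p$ is the $p$-quantile of $\mathcal{N}(0, 1)$. This formula is stated here as an assumption about the intended answer, and the function names and the numerical inputs are purely illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def min_sample_size(theta0, theta1, sigma, alpha, beta):
    """Smallest n with Type I error <= alpha and Type II error <= beta."""
    q = STD_NORMAL.inv_cdf(1 - alpha) + STD_NORMAL.inv_cdf(1 - beta)
    return ceil((sigma * q / (theta1 - theta0)) ** 2)


def type_two_error(theta0, theta1, sigma, alpha, n):
    """Type II error of the test rejecting for large sample means (theta1 > theta0)."""
    shift = (theta1 - theta0) * sqrt(n) / sigma
    return STD_NORMAL.cdf(STD_NORMAL.inv_cdf(1 - alpha) - shift)


# Hypothetical inputs chosen only for illustration.
n_star = min_sample_size(theta0=0.0, theta1=0.5, sigma=1.0, alpha=0.05, beta=0.10)
print(n_star, type_two_error(0.0, 0.5, 1.0, 0.05, n_star))
```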

3.49. Let $\bar X$ and $\bar Y$ be the sample means of two samples with
sizes $n$ and $m$ from the distributions
$\mathcal{N}(\theta_1, \sigma_1^2)$ and
$\mathcal{N}(\theta_2, \sigma_2^2)$, respectively. Using the statistic
$T = (\bar X - \bar Y)/\sigma$, where
$\sigma^2 = \sigma_1^2/n + \sigma_2^2/m$, construct a test to verify
the hypothesis $H_0$: $\Delta = \theta_1 - \theta_2 = 0$ against the
alternative $H_1$: $\Delta > 0$.

Suppose that the probabilities of Type I and Type II errors are
$\alpha$ and $\beta$, respectively, and $n$ is the size of the first
sample. Find the minimum size $m^*$ of the second sample such that the
probabilities of erroneous conclusions are not greater than $\alpha$
and $\beta$.

Hint. Use the solutions to Problems 3.47 and 3.48.

3.50. Given a sample of size n, construct a most powerful test to 
distinguish between two simple hypotheses with respect to an 
unknown variance of a normal distribution (the mean is known). Find 
the test's power. 

3.51*. Given an observation $X$, distinguish between two distributions
with the densities $f_0(x)$ (the hypothesis $H_0$) and $f_1(x)$ (the
hypothesis $H_1$). Consider a test of the form

$$\mathcal{X}_1(c) = \{x:\ f_1(x) \ge c f_0(x)\}, \quad c > 0.$$

Let $\alpha(c)$ and $\beta(c)$ be the probabilities of Type I and
Type II errors, respectively. Show that

(1) $\dfrac{\beta(c)}{1 - \alpha(c)} \le c \le \dfrac{1 - \beta(c)}{\alpha(c)}$;

(2) if $\alpha(c) + \beta(c) \le 1$, the test is unbiased;

(3) $\min_c\,(\alpha(c) + \beta(c)) = \alpha(1) + \beta(1)$, i.e., the
$\mathcal{X}_1(1)$-test minimizes the sum of the probabilities of the
errors;

(4) suppose that $X$ is a repeated sample of size $n$, i.e.,
$X = (X_1, \ldots, X_n)$, $f_j(x) = \prod_{i=1}^{n} f_j(x_i)$,
$j = 0, 1$, and the probabilities of errors for the
$\mathcal{X}_1(1)$-test are $\alpha_n$ and $\beta_n$. Prove that if
$\int f_0(x) \ln (f_1(x)/f_0(x))\, dx = \delta < 0$, then
$\alpha_n, \beta_n \to 0$ as $n \to \infty$ (which means that we can
completely distinguish between the hypotheses $H_0$ and $H_1$).

Hint. Write
$\mathcal{X}_1(1) = \bigl\{x:\ T_n(x) = \frac{1}{n} \sum_{i=1}^{n} \ln \frac{f_1(x_i)}{f_0(x_i)} \ge 0\bigr\}$
and apply the law of large numbers to the statistic $T_n(X)$. Note also
that we always have $\delta \le 0$ [7, p. 157] by Jensen's inequality.

3.52*. Let $\xi = (\xi_1, \ldots, \xi_r)$ be a normal random vector
distributed as $\mathcal{N}(\mu^{(i)}, A)$, $i = 0, 1$, under the
hypothesis $H_i$ (the common covariance matrix $A$ is supposed to be
non-singular). Construct a Neyman-Pearson test to distinguish the
hypothesis $H_0$ against the alternative $H_1$ by a single observation
on $\xi$. Construct a test minimizing the sum of the probabilities of
errors.

Composite Hypotheses 

3.53. Given a binomial model $Bi(k, \theta)$, construct a uniformly
most powerful test for verifying the hypothesis $H_0$:
$\theta \le \theta_0$ against the alternative $H_1$:
$\theta > \theta_0$ for a sample of size $n$.

Hint. Use the property of a model with a monotone likelihood ratio and
the solution to Problem 3.39.

3.54. Show that the Neyman-Pearson test constructed in Problem 3.41
(for Poisson's model $\Pi(\theta)$) is a u.m.p. test for verifying the
hypothesis $H_0$: $\theta \le \theta_0$ against the alternative $H_1$:
$\theta > \theta_0$.

Hint. Use the solution to Problem 3.53.

3.55. Suppose that in a Bernoulli scheme the trials are carried out
until the $r$th failure with an unknown probability of success
$\theta$, and $T_r$ is the observed number of successes. Construct a
u.m.p. test for verifying the hypothesis $H_0$: $\theta \le \theta_0$
against the alternative $H_1$: $\theta > \theta_0$ and show that for
$r \to \infty$ the respective critical boundary at the significance
level $\alpha$ has the form
$t_\alpha = (r\theta_0 - u_\alpha \sqrt{r\theta_0})/(1 - \theta_0)$,
$\Phi(u_\alpha) = \alpha$.

Hint. Use the property of a model with a monotone likelihood ratio,
the representation $T_r = X_1 + \ldots + X_r$, where
$X_1, \ldots, X_r$ are independent and similarly distributed random
variables with $\mathcal{L}(X_i) = \overline{Bi}(1, \theta)$, and apply
the Central Limit Theorem.

3.56. Show that the tests constructed in Problem 3.43 are u.m.p. tests
for verifying the composite one-sided hypotheses $H_0$:
$\theta \le \theta_0$ against $H_1$: $\theta > \theta_0$ and $H_0$:
$\theta \ge \theta_0$ against $H_1$: $\theta < \theta_0$.

3.57. (Sampling inspection.) Suppose that a batch of $N$ items contains
an unknown number of defective items, $\theta \in \{0, 1, \ldots, N\}$.
In order to verify the hypothesis $H_0$: $\theta \le \theta_0$ against
the alternative $H_1$: $\theta > \theta_0$, we test each of the $n$
items chosen for control. Based on the statistic $T$ (the number of
defectives in the sample), construct a u.m.p. test.

Hint. Make sure that the distribution of $T$ (the hypergeometric
distribution $H(\theta, N, n)$) has a monotone likelihood ratio.

3.58. Given a normal model $\mathcal{N}(\theta, \sigma^2)$ with an
unknown mean, construct a u.m.p. test for verifying the hypotheses
$H_0$: $\theta \le \theta_0$ against $H_1$: $\theta > \theta_0$ and
$H_0$: $\theta \ge \theta_0$ against $H_1$: $\theta < \theta_0$.

Hint. Use the solution to Problem 3.47 and the properties of an
exponential model.

3.59. Show that the test constructed in Problem 3.50 for the case
$\theta_0 > \theta_1$ is a u.m.p. test for verifying a composite
hypothesis $H_0$: $\theta \ge \theta_0$ against the left-sided
alternative $H_1$: $\theta < \theta_0$ and, similarly, the test for
$\theta_0 < \theta_1$ is a u.m.p. test for verifying the hypothesis
$H_0$: $\theta \le \theta_0$ against the right-sided alternative $H_1$:
$\theta > \theta_0$.

Hint. Use the properties of an exponential model (see Sec. 3.5).

3.60*. Using Problems 3.47 and 3.58 and applying two one-sided critical
regions, construct an unbiased test for verifying the hypothesis $H_0$:
$\theta = \theta_0$ about the mean against the two-sided alternative
$H_1$: $\theta \ne \theta_0$. Is this test uniformly most powerful?

3.61*. Let $X = (X_1, \ldots, X_n)$ be a sample from the normal
distribution $\mathcal{N}(\mu, \theta^2)$. Construct a u.m.p. unbiased
test for verifying a simple hypothesis $H_0$: $\theta = \theta_0$
against the two-sided alternative $H_1$: $\theta \ne \theta_0$.

Hint. Apply Theorem 4.5 [7, p. 196] on the general form of a u.m.p.
unbiased test and use the solution to Problem 3.50.

3.62. Given a sample of size $n$ from the distribution
$\Gamma(\theta, 1)$, construct a u.m.p. unbiased test for verifying the
hypothesis $H_0$: $\theta = \theta_0$ against the alternative $H_1$:
$\theta \ne \theta_0$.

Hint. Use the solutions to Problems 3.43 and 3.61.

3.63. Given a large sample of size $n$, construct a local most powerful
test for verifying the hypothesis $H_0$: $\theta = \theta_0$ against
the common alternative $H_1$: $\theta \ne \theta_0$ for the model
$Bi(k, \theta)$. Show that its power function $W_n(\theta)$ satisfies
the limiting relation

$$\lim_{n \to \infty} W_n(\theta^{(n)}) = \Phi\Bigl(-\frac{\beta\sqrt{k}}{\sqrt{\theta_0(1 - \theta_0)}} + u_{\alpha/2}\Bigr) + \Phi\Bigl(\frac{\beta\sqrt{k}}{\sqrt{\theta_0(1 - \theta_0)}} + u_{\alpha/2}\Bigr)$$

for the significance level $\alpha$ and local alternatives of the form
$\theta = \theta^{(n)} = \theta_0 + \beta/\sqrt{n}$.

Hint. Use the general form

$$\mathcal{X}_{1\alpha} = \bigl\{x:\ |U(x; \theta_0)| \ge -u_{\alpha/2} \sqrt{n\, i(\theta_0)}\bigr\}$$

for the asymptotic (at large $n$) two-sided test for regular models,
where $U(x; \theta)$ is the contribution function of the sample
$X = (X_1, \ldots, X_n)$, and $i(\theta)$ is Fisher's information
function [7, p. 199]. Use the solutions to Problems 3.39 and 3.40.

3.64. Given a large sample of size $n$, construct a local most powerful
test for verifying the hypothesis $H_0$: $\theta = \theta_0$ against
the alternative $H_1$: $\theta \ne \theta_0$ for the model
$\Pi(\theta)$. Show that its power function $W_n(\theta)$ satisfies the
limiting relation

$$\lim_{n \to \infty} W_n(\theta^{(n)}) = \Phi\Bigl(-\frac{\beta}{\sqrt{\theta_0}} + u_{\alpha/2}\Bigr) + \Phi\Bigl(\frac{\beta}{\sqrt{\theta_0}} + u_{\alpha/2}\Bigr)$$

for the significance level $\alpha$ and local alternatives of the form
$\theta = \theta^{(n)} = \theta_0 + \beta/\sqrt{n}$.

Hint. Use the hint to Problem 3.63 and the solution to Problem 3.41.

Tests of Hypotheses and Confidence Estimation 

Problems 3.65-72 are based on the principle of correspondence be- 
tween confidence estimation problems and tests of hypotheses (see 
Sec. 3.6). 

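3.65. Using the confidence intervals constructed in Problems 2.119-120
for the parameters $\theta_1$ and $\theta_2$ in the normal model
$\mathcal{N}(\theta_1, \theta_2^2)$, construct the test for verifying
the null hypotheses $H_0$ against the alternatives $H_1$ for the cases

(1) $H_0$: $\theta_1 = \theta_{10}$, $H_1$: $\theta_1 > \theta_{10}$;
(2) $H_0$: $\theta_1 = \theta_{10}$, $H_1$: $\theta_1 < \theta_{10}$;
(3) $H_0$: $\theta_1 = \theta_{10}$, $H_1$: $\theta_1 \ne \theta_{10}$;
(4) $H_0$: $\theta_2 = \theta_{20}$, $H_1$: $\theta_2 > \theta_{20}$;
(5) $H_0$: $\theta_2 = \theta_{20}$, $H_1$: $\theta_2 < \theta_{20}$;
(6) $H_0$: $\theta_2 = \theta_{20}$, $H_1$: $\theta_2 \ne \theta_{20}$.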

3.66. Using the solution to Problems 2.122-123, show that at a
significance level $\alpha$ the hypothesis on the equality of the means
of two normal models with known variances can be verified by the test

$$\mathcal{X}_{1\alpha} = \bigl\{(x, y):\ |\bar x - \bar y| \ge u_{1-\alpha/2} \sqrt{\sigma_1^2/n + \sigma_2^2/m}\bigr\}.$$

If the variances are unknown, the test has the form

$$\mathcal{X}_{1\alpha}^* = \Bigl\{(x, y):\ |\bar x - \bar y| \ge t_{1-\alpha/2,\, n+m-2} \sqrt{\frac{(n + m)\,(nS^2(x) + mS^2(y))}{nm(n + m - 2)}}\Bigr\}.$$



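3.67. Using the solution to Problem 2.125, show that the test

$$\mathcal{X}_{1\alpha} = \Bigl\{\frac{n(m - 1)}{m(n - 1)} \frac{S^2(x)}{S^2(y)} \le F_{\alpha/2,\, n-1,\, m-1}\Bigr\} \cup \Bigl\{\frac{n(m - 1)}{m(n - 1)} \frac{S^2(x)}{S^2(y)} \ge F_{1-\alpha/2,\, n-1,\, m-1}\Bigr\}$$

can be used to verify the hypothesis that the variances of two normal
models are equal.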

3.68. Under the conditions of Problem 2.127, construct a test for
verifying the homogeneity hypothesis $H_0$:
$\tau = \theta_1/\theta_2 = 1$ (i.e., $\theta_1 = \theta_2$) and
calculate its power function.

3.69. Using the confidence interval from Problem 2.128, construct a
test for verifying the hypothesis $H_0$: $\theta = \theta_0$ for the
respective model. Calculate its power function and make sure that it is
unbiased.

3.70. Using the results of Problem 2.129, construct a test for
verifying the hypothesis $H_0$: $\theta = \theta_0$ for the uniform
distribution $R(0, \theta)$, calculate its power function, and make
sure that it is unbiased.

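3.71. Using the results of Problem 2.130, show that the test for
verifying the hypothesis $H_0$: $\theta = \theta_0$ for Weibull's model
$W(0, \lambda, \theta)$ has the form

$$\mathcal{X}_{1\alpha} = \Bigl\{T \le \frac{\theta_0}{2} \chi^2_{\alpha_1,\, 2n}\Bigr\} \cup \Bigl\{T \ge \frac{\theta_0}{2} \chi^2_{1-\alpha_2,\, 2n}\Bigr\}, \quad \alpha_1 + \alpha_2 = \alpha.$$

In order to obtain an unbiased test, the quantities
$\chi^2_{\alpha_1,\, 2n}$ and $\chi^2_{1-\alpha_2,\, 2n}$ are chosen as
in Problem 3.62.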

3.72. Using the condition of Problem 2.131 and its result, construct a
test for verifying the hypothesis $H_0$:
$(\theta_1, \theta_2) = (\theta_{10}, \theta_{20})$.

Likelihood Ratio Test 

3.73*. Construct a likelihood ratio test for the hypothesis $H_0$:
$\theta_1 = \theta_{10}$ for the mean of the normal model
$\mathcal{N}(\theta_1, \theta_2^2)$ and show that for large samples it
has the form

$$\mathcal{X}_{1\alpha} = \bigl\{x:\ \sqrt{n}\,|\bar x - \theta_{10}|/S(x) \ge -u_{\alpha/2}\bigr\},$$

and its power under the alternative
$\theta_1^{(n)} = \theta_{10} + \beta/\sqrt{n}$ is equal to
$1 - F_1(u_{\alpha/2}^2;\ \beta^2/\theta_2^2)$ as $n \to \infty$ (see
Sec. 3.7).

Hint. Use Problems 1.47 and 2.44, and the asymptotic theory for
likelihood ratio tests [7, p. 212].
3.74*. Show that the likelihood ratio test for the hypothesis $H_0$:
$\theta_2 = \theta_{20}$ for the variance of the normal model
$\mathcal{N}(\theta_1, \theta_2^2)$ has the form

$$\mathcal{X}_{1\alpha} = \bigl\{nS^2(x)/\theta_{20}^2 \le \chi^2_{\alpha_1,\, n-1}\bigr\} \cup \bigl\{nS^2(x)/\theta_{20}^2 \ge \chi^2_{1-\alpha_2,\, n-1}\bigr\},$$

where $\alpha_1 + \alpha_2 = \alpha$, and $S^2$ is the sample variance
for a sample of size $n$. Compute the test's power function and find
$\alpha_1$ and $\alpha_2$ for which the test is unbiased.

Hint. Use the fact that
$\mathcal{L}(nS^2(X)/\theta_2^2) = \chi^2(n - 1)$ and also the solution
to Problem 3.61.

3.75. Construct a likelihood ratio test for the hypothesis $H_0$:
$\theta = \theta_0$ for the model $Bi(1, \theta)$ and show that its
asymptotic (for large samples) variant coincides with the local most
powerful test constructed in Problem 3.63 (for $k = 1$).

Hint. Use the general theory of likelihood ratio tests for a polynomial
distribution [7, pp. 207-208].

3.76. Construct the likelihood ratio test for the hypothesis $H_0$:
$\theta = \theta_0$ for the model $\Pi(\theta)$ and make sure that its
asymptotic variant for large samples coincides with the local most
powerful test constructed in Problem 3.64.

Hint. Use the fact that as $n \to \infty$ the limiting distributions of
the statistics $-2 \ln \lambda_n$ and
$Q_n = U^2(\theta_0)/(n\, i(\theta_0))$ coincide under the hypothesis
$H_0$ [7, p. 207].

3.77*. Let $\bar X_1, \ldots, \bar X_k$ be the sample means of the
independent samples with sizes $n_1, \ldots, n_k$ from the populations
$Bi(1, \theta_1), \ldots, Bi(1, \theta_k)$, respectively. Construct and
calculate an asymptotic (as $n_1, \ldots, n_k \to \infty$) likelihood
ratio test for the homogeneity hypothesis $H_0$:
$\theta_1 = \ldots = \theta_k$. Show that the test is similar to the
$\chi^2$ homogeneity test [7, pp. 160-161].

Hint. Use the solution to Problem 3.75.

3.78*. Let $X_j = (X_{j1}, \ldots, X_{jn_j})$, $j = 1, \ldots, k$, be
independent samples from the populations
$\Pi(\theta_1), \ldots, \Pi(\theta_k)$, respectively. Construct and
calculate an asymptotic (as $n_1, \ldots, n_k \to \infty$) likelihood
ratio test for the homogeneity hypothesis $H_0$:
$\theta_1 = \ldots = \theta_k$. Analyze the following data: the sums of
the four samples of sizes 120, 100, 100, 125 from Poisson's populations
were respectively 251, 323, 180, 426. Can we infer that the general
means are equal?
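
For the numerical part, one common form of the statistic is $-2 \ln \lambda = 2 \sum_j T_j \ln(\hat\theta_j/\hat\theta)$, where $T_j$ is the sum of the $j$th sample, $\hat\theta_j = T_j/n_j$ and $\hat\theta = \sum_j T_j / \sum_j n_j$. The Python sketch below assumes this form and the usual chi-square approximation with $k - 1$ degrees of freedom; the function name is hypothetical, and 7.815 is the tabulated 0.95-quantile of chi-square with 3 degrees of freedom.

```python
from math import log


def poisson_homogeneity_lr(sizes, sums):
    """Return (-2 ln lambda, degrees of freedom) for k Poisson samples."""
    pooled = sum(sums) / sum(sizes)          # estimate of theta under H0
    stat = 0.0
    for n_j, t_j in zip(sizes, sums):
        if t_j > 0:                          # a zero sum contributes nothing
            stat += 2.0 * t_j * log((t_j / n_j) / pooled)
    return stat, len(sizes) - 1


stat, dof = poisson_homogeneity_lr([120, 100, 100, 125], [251, 323, 180, 426])

# Tabulated 0.95-quantile of chi-square with 3 degrees of freedom.
CRITICAL_095_DOF3 = 7.815
print(f"-2 ln(lambda) = {stat:.2f}, dof = {dof}, reject H0: {stat > CRITICAL_095_DOF3}")
```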

3.79*. Let $n_j$, $\bar X_j$, and $S_j^2$ be the size, mean, and
variance, respectively, of a sample from the population
$\mathcal{N}(\theta_{1j}, \theta_2^2)$, $j = 1, \ldots, k$ (the samples
are assumed to be independent). Construct a likelihood ratio test for
the homogeneity hypothesis $H_0$:
$\theta_{11} = \ldots = \theta_{1k}$. Show that in the case of two
samples ($k = 2$) the test has the form
$\mathcal{X}_{1\alpha} = \{|T| \ge t_{1-\alpha/2,\, n-2}\}$ (compare
with Problem 3.66), where

$$T = (\bar X_1 - \bar X_2) \sqrt{\frac{n_1 n_2 (n - 2)}{n\,(n_1 S_1^2 + n_2 S_2^2)}}, \quad n = n_1 + n_2.$$

Hint. Use Problems 2.86 and 2.114, and also the assertion
$\mathcal{L}(T \mid H_0) = S(n - 2)$ [7, Theorem 1.12].

3.80*. Let $S_1^2, \ldots, S_k^2$ be sample variances constructed from
independent samples with sizes $n_1, \ldots, n_k$ from the populations
$\mathcal{N}(\theta_{1j}, \theta_{2j}^2)$, $j = 1, \ldots, k$,
respectively. Construct a likelihood ratio test for the hypothesis
$H_0$: $\theta_{21} = \ldots = \theta_{2k}$ that the variances are
equal. Show that in the case of two samples ($k = 2$) the test has the
form (compare with Problem 3.67)

$$\mathcal{X}_{1\alpha} = \{F \le F_{\alpha_1,\, n_1-1,\, n_2-1}\} \cup \{F \ge F_{1-\alpha_2,\, n_1-1,\, n_2-1}\},$$

where $\alpha_1 + \alpha_2 = \alpha$,
$F = [n_1(n_2 - 1)S_1^2]/[n_2(n_1 - 1)S_2^2]$.

Hint. Use the solution to Problem 3.79 and the assertion
$\mathcal{L}(F \mid H_0) = S(n_1 - 1, n_2 - 1)$ [7, Theorem 1.13].

Various Problems 

3.81. Assume that the observable random variables $X_1, \ldots, X_n$
are independent and normal though, generally speaking, they have
various distributions. We test the hypothesis $H_0$ that they are
distributed similarly.
Using Problem 1.58, show that the critical region at the significance level

a can be defined as a£J« = iM > v a ), where v a is found from the 

. • t,{, 2 n-2 A 
beta distribution function by the relation B I 1 - v„; — x — . ^ 1 = *■ 

We can also use published tables for the beta distribution 

3.82. Let $X_i = (X_{i1}, X_{i2})$, $i = 1, \ldots, n$, be independent
observations on the two-dimensional random variable
$\xi = (\xi_1, \xi_2)$ distributed normally with unknown parameters,
and let $\rho_n$ be the sample correlation coefficient constructed from
these data.

Using the results of Problem 1.59, show that the critical region 

m« = ) lei > / , ± 
defines a test of significance level $\alpha$ for the hypothesis $H_0$
that the components $\xi_1$ and $\xi_2$ are independent.

3.83. Let the observable random variables $X_1, \ldots, X_n$ be
independent and $\mathcal{L}(X_i) = \Pi(\theta_i)$,
$i = 1, \ldots, n$. Using the results obtained in Problem 1.60, show
that for large $n$ the test
$\mathcal{X}_{1\alpha} = \{|T_n| \ge -u_{\alpha/2}\}$ can be applied to
verify the homogeneity hypothesis $H_0$:
$\theta_1 = \ldots = \theta_n$.

3.84. What kind of a goodness of fit test can be constructed from the
result of Problem 1.61?

3.85*. (Asymptotic efficiency of tests.) Let us verify a simple
hypothesis $H_0$: $\theta = \theta_0$ against the alternative $H_1$:
$\theta > \theta_0$ for a model with a scalar parameter
$\theta \in \Theta$, where $\Theta$ is an interval on the real axis.
We use a test of the form
$\mathcal{X}_{1\alpha} = \{T_n \ge \gamma_n\}$, where $T_n$ is a
statistic for a sample of size $n$, which possesses the following
properties:

(a) there are functions $\mu(\theta)$ and $\sigma(\theta) > 0$, with

$$\mathcal{L}_\theta\bigl(\sqrt{n}\,(T_n - \mu(\theta))/\sigma(\theta)\bigr) \to \mathcal{N}(0, 1)$$

uniformly in $\theta$ lying in the interval
$\theta_0 \le \theta \le \theta_0 + \eta$, where $\eta > 0$ is any
number and $n \to \infty$;

(b) $\mu(\theta)$ is differentiable at the point $\theta_0$ and
$\mu'(\theta_0) > 0$, while $\sigma(\theta)$ is continuous at
$\theta_0$.

Prove that (1) given a significance level $\alpha$, the critical
boundary $\gamma_n$ has the asymptotic form

$$\gamma_n = \mu(\theta_0) - u_\alpha \sigma(\theta_0)/\sqrt{n};$$

(2) for close alternatives of the form
$\theta^{(n)} = \theta_0 + \beta/\sqrt{n}$, $\beta > 0$, the power
$W_n(\theta^{(n)})$ of the test satisfies the limiting relation

$$e(\beta, \alpha) = \lim_{n \to \infty} W_n(\theta^{(n)}) = \Phi\bigl(\beta \mu'(\theta_0)/\sigma(\theta_0) + u_\alpha\bigr).$$

Remark. The quantity $e = e(\beta, \alpha)$ is called Pitman's
efficiency of the test
$\mathcal{X}_{1\alpha} = \{T_n \ge \mu(\theta_0) - u_\alpha \sigma(\theta_0)/\sqrt{n}\}$
and is frequently used as a measure for comparing various tests. For
large samples the measure $e$ describes the local behaviour of the
test's power curve in the neighbourhood of the point $\theta_0$.

3.86*. (Continued from Problem 3.85.) Let $T_n^{(j)}$, $j = 1, 2$, be
two statistics meeting the conditions formulated above. We will label
their characteristics with the superscript $j$. We assume that for each
$n$ there is an integer $N_n$ such that

$$W_n^{(1)}(\theta_0 + \beta/\sqrt{n}) = W_{N_n}^{(2)}(\theta_0 + \beta/\sqrt{n}),$$

i.e., the powers of the tests are equal under the alternative
$\theta^{(n)}$ if $n$ is the sample size in the first case, and $N_n$
is the sample size in the second case. Suppose also that
$N_n \to \infty$ as $n \to \infty$. Prove that

$$\lim_{n \to \infty} \frac{n}{N_n} = \Bigl(\frac{\mu^{(2)\prime}(\theta_0)}{\sigma^{(2)}(\theta_0)}\Bigr)^2 \Big/ \Bigl(\frac{\mu^{(1)\prime}(\theta_0)}{\sigma^{(1)}(\theta_0)}\Bigr)^2 = \frac{e_2'}{e_1'},$$

where $e' = (\mu'(\theta_0)/\sigma(\theta_0))^2$.

Remark. The quantity $e'$ is an increasing function of Pitman's
efficiency $e(\beta, \alpha)$ for fixed $\beta$ and $\alpha$ and can
serve as a measure of the asymptotic efficiency of a test. This means
that the relative efficiency of the second test with respect to the
first one is equal to the limit of the ratio of the first sample size
to the second one. The sample sizes are chosen so that for the
indicated alternatives $\theta^{(n)}$ the test powers are equal.

3.87. We test the hypothesis $H_0$: $\theta = \theta_0$ against the
alternative $H_1$: $\theta > \theta_0$ for the normal model
$\mathcal{N}(\theta, \sigma^2)$. Construct tests of the form
$\mathcal{X}_{1\alpha} = \{T_n \ge \gamma_n\}$ based on the statistics
$T_n^{(1)} = \bar X$ (the sample mean) and $T_n^{(2)} = Z_{n, 1/2}$
(the sample median) and show that the relative efficiency of the second
test with respect to the first one is $e_2'/e_1' = 2/\pi = 0.637\ldots$

Hint. Use the solutions to Problems 3.86, 3.85, and 1.32.

3.88*. Suppose that we observe a random vector
$X = (X_1, \ldots, X_n)$ distributed as
$\mathcal{L}(X) = \mathcal{N}(\theta t, \Sigma = \|\sigma_{ij}\|)$,
where $\theta$ is an unknown scalar parameter,
$t = (t_1, \ldots, t_n)$, $0 < t_1 < t_2 < \ldots < t_n$, are known
constants, and $\sigma_{ij} = t_i$, $i \le j$. (If $\eta(t)$,
$t \ge 0$, is Wiener's process [2], i.e., a homogeneous random process
with independent increments, and
$\mathcal{L}(\eta(t)) = \mathcal{N}(\theta t, t)$, $t > 0$, then
$X_i = \eta(t_i)$, $i = 1, \ldots, n$, i.e., $X$ are the observations
on $\eta(t)$ at the moments $t_1, \ldots, t_n$.)

Show that the last observation $X_n$ is a sufficient statistic for
$\theta$ and, using this fact, construct the tests for verifying the
hypothesis $H_0$: $\theta = 0$ that the process has no systematic trend
(shift). Consider the alternatives $H_1^+$: $\theta > 0$, $H_1^-$:
$\theta < 0$, and $H_1$: $\theta \ne 0$.

Hints. (1) Use the factorization test and establish the equation
$t'\Sigma^{-1} = (0 \ldots 0\ 1)$.
(2) Use the solutions to Problems 3.47, 3.58, and 3.60.



CHAPTER 4 



Linear Regression and 
the Least Squares Method 



4.1. A linear regression model implies that the observable random
variables $X_1, \ldots, X_n$ are "on average" linearly dependent on
non-random factors $z_1, \ldots, z_k$, $k < n$, whose values may change
from trial to trial. In this case the original statistical data are a
set of the observed "responses" $x_1, \ldots, x_n$ and the respective
factors, i.e., have the form $(x_i;\ z_1^{(i)}, \ldots, z_k^{(i)})$,
$i = 1, \ldots, n$. We also assume that

$$\mathsf{E} X_i = \sum_{j=1}^{k} z_j^{(i)} \beta_j = z^{(i)\prime} \beta, \quad i = 1, \ldots, n,$$

where $\beta = (\beta_1, \ldots, \beta_k)$ is the set of all unknown
parameters called regression coefficients. If we introduce the random
variables $\varepsilon_i = X_i - z^{(i)\prime} \beta$, which are called
the measurement "errors", and the plan matrix
$Z = \|z^{(1)} \ldots z^{(n)}\|$ sized $k \times n$, then we will
obtain a matrix form

$$X = Z'\beta + \varepsilon, \quad X = (X_1, \ldots, X_n), \quad \varepsilon = (\varepsilon_1, \ldots, \varepsilon_n), \qquad (4.1)$$

of the linear regression model. Here $\mathsf{E}(\varepsilon) = 0$ and
it is usually assumed that the random variables are uncorrelated and
have the same variance, i.e., the matrix of the second moments of the
observation vector $X$ has the form

$$\mathsf{D}(X) = \mathsf{D}(\varepsilon) = \mathsf{E}\varepsilon\varepsilon' = \sigma^2 E_n. \qquad (4.2)$$

The quantity $\sigma^2$ is called the residual variance, which is
usually also unknown. If the non-random variables have the form
$z_j = a_j(t)$, where $a_j(t)$ is a polynomial, we have parabolic
regression.

The case of $k = 2$ is frequently applied. Here the vectors $z^{(i)}$
are of the form $z^{(i)} = (1, t_i)$, i.e.,
$\mathsf{E} X_i = \beta_1 + \beta_2 t_i$, $i = 1, \ldots, n$ (the
average value of the observations is a linear function of only the
factor $t$). This is a simple regression model, where the straight line
$\varphi(t) = \beta_1 + \beta_2 t$ is called a regression line, and the
coefficient $\beta_2$ is its slope.
Some problems in regression analysis are solved under an additional
assumption about the distribution of the errors $\varepsilon$, and they
are mostly taken to be normally distributed as
$\mathcal{L}(\varepsilon) = \mathcal{N}(0, \sigma^2 E_n)$. In this case
the model has the form

$$\mathcal{L}(X) = \mathcal{N}(Z'\beta, \sigma^2 E_n) \qquad (4.3)$$

and is called the normal regression.

When seeking the vector $\beta = (\beta_1, \ldots, \beta_k)$ of unknown
parameters, we use a linear regression model. We cannot measure the
unknown parameters directly and only define some functions of them.
Restoring a functional dependence belongs to this kind of problem. The
expansion coefficients of the restored function, given a certain system
of functions, are then the unknown parameters.

4.2. Regression analysis mostly deals with estimation of the unknown
parameters $\beta = (\beta_1, \ldots, \beta_k)$ and $\sigma^2$ of the
model (4.1-2) or, in the case of the normal regression (4.3), with
their confidence estimation and testing the hypotheses about the
parameters.

A general technique for estimating unknown regression coefficients
$\beta$ is the least squares method. The estimates are found from the
condition that the quadratic form

$$S(\beta) = S(X; \beta) = (X - Z'\beta)'(X - Z'\beta) \qquad (4.4)$$

is minimized. The point
$\hat\beta = (\hat\beta_1, \ldots, \hat\beta_k)$ which satisfies the
equation $S(\hat\beta) = \min_\beta S(\beta)$ is called the least
squares estimate (l.s.e.) for the parameter $\beta$.

The matrix $A = ZZ'$ is fundamental to these problems. We will assume
that the matrix is non-singular (or, which is equivalent,
rank $Z = k$). Then the l.s.e. is uniquely defined by the normal
equation $A\beta = Y \equiv ZX$ and has the form
$\hat\beta = A^{-1}Y = A^{-1}ZX$. The estimate $\hat\beta$ is unbiased
($\mathsf{E}\hat\beta = \beta$) and has a minimal variance (i.e., the
variances of all the components of the vector $\hat\beta$ are minimal)
in the class of all linear (i.e., linearly dependent on the
observations $X$) unbiased estimates for $\beta$. Moreover, the
estimate $\hat t = T\hat\beta$ possesses the same properties as an
estimate for the parameter $t = T\beta$, where $T$ is a given
$m \times k$ matrix. Here $\mathsf{D}(\hat t) = \sigma^2 TA^{-1}T'$
and, specifically, $\mathsf{D}(\hat\beta) = \sigma^2 A^{-1}$.

The statistic

$$\tilde\sigma^2 = \frac{1}{n-k}\,S(\hat\beta) = \frac{1}{n-k}\,X'BX, \qquad B = E_n - Z'A^{-1}Z \tag{4.5}$$

is an unbiased estimate for the residual variance $\sigma^2$ [7, pp. 223-226].
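The normal equation and the residual-variance estimate can be illustrated by a small computation. The following is only a sketch in plain Python, under the convention used here that $Z$ is a $k \times n$ matrix and $A = ZZ'$; the function names `solve_linear_system` and `least_squares` are illustrative assumptions and do not come from the text.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    k = len(b)
    # Work on an augmented copy so that the inputs are left untouched.
    m = [list(a[i]) + [b[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix A = ZZ' is singular (rank Z < k)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, k):
            factor = m[r][col] / m[col][col]
            for c in range(col, k + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, k))
        x[i] = (m[i][k] - tail) / m[i][i]
    return x


def least_squares(z, x):
    """Return (beta_hat, S(beta_hat), sigma2_tilde) for the model X = Z'beta + eps.

    z is the k x n matrix Z given as a list of k rows; x holds the n observations.
    """
    k, n = len(z), len(x)
    # A = ZZ' and Y = ZX, the two ingredients of the normal equation A beta = Y.
    a = [[sum(z[p][i] * z[q][i] for i in range(n)) for q in range(k)] for p in range(k)]
    y = [sum(z[p][i] * x[i] for i in range(n)) for p in range(k)]
    beta_hat = solve_linear_system(a, y)
    residuals = [x[i] - sum(z[p][i] * beta_hat[p] for p in range(k)) for i in range(n)]
    s = sum(u * u for u in residuals)  # S(beta_hat), see (4.4)
    return beta_hat, s, s / (n - k)    # the last value is the estimate (4.5)


# Illustrative data (an assumption): exact observations of x(t) = 1 + 2t.
t_points = [0.0, 1.0, 2.0, 3.0]
z_matrix = [[1.0] * len(t_points), t_points]
beta_hat, s_min, sigma2_tilde = least_squares(z_matrix, [1.0, 3.0, 5.0, 7.0])
```

Solving the normal equation directly is adequate for the small $k$ met in the problems of this chapter; for ill-conditioned $A$ a decomposition-based solver would be the safer choice.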
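In the interpolation problems dealing with an unknown function
$x = f(t)$ which relates the variables $t$ and $x$, given the observations
$(t_i, X_i = x_i + \varepsilon_i)$, $x_i = f(t_i)$, $i = 1, \ldots, n$, we seek
an interpolation polynomial of the form

$$\varphi_k(t; \beta) = \sum_{j=1}^{k} \beta_j a_j(t),$$

where Chebyshev's orthogonal polynomials are used for $a_j(t)$, i.e.,
polynomials orthogonal on the set of points $t_1, \ldots, t_n$. The l.s.e.'s
for the unknown coefficients $\beta_j$ are calculated from the formulas

$$\hat\beta_j = \frac{1}{a_j^2} \sum_{i=1}^{n} X_i a_j(t_i), \qquad a_j^2 = \sum_{i=1}^{n} a_j^2(t_i), \qquad j = 1, \ldots, k, \tag{4.6}$$

where the quantity $S(\hat\beta) = \sum_{i=1}^{n} X_i^2 - \sum_{j=1}^{k} a_j^2 \hat\beta_j^2$
defines the approximation accuracy. The first three Chebyshev's polynomials
are of the form

$$a_1(t) \equiv 1, \qquad a_2(t) = t - \bar t, \qquad a_3(t) = (t - \bar t)\left(t - \bar t - \frac{m_3}{s^2}\right) - s^2,$$

where

$$\bar t = \frac{1}{n}\sum_{i=1}^{n} t_i, \qquad s^2 = \frac{1}{n}\sum_{i=1}^{n} (t_i - \bar t)^2, \qquad m_3 = \frac{1}{n}\sum_{i=1}^{n} (t_i - \bar t)^3$$

(see [7, pp. 231-234]).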
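The construction can be sketched numerically. The code below is a minimal illustration, assuming the three-term recurrence quoted in the hint to Problem 4.32, $a_{r+1}(t) = (t + \alpha)a_r(t) + \beta a_{r-1}(t)$; the function names `chebyshev_values` and `fit_orthogonal` are assumptions made for this example.

```python
def chebyshev_values(t, k):
    """Values a_j(t_i), j = 1..k, of polynomials orthogonal on the points t."""
    n = len(t)
    polys = [[1.0] * n]  # a_1(t) = 1
    for r in range(1, k):
        cur = polys[-1]
        norm_cur = sum(v * v for v in cur)
        # alpha = -sum t_i a_r(t_i)^2 / a_r^2
        alpha = -sum(ti * v * v for ti, v in zip(t, cur)) / norm_cur
        nxt = [(ti + alpha) * v for ti, v in zip(t, cur)]
        if r >= 2:
            prev = polys[-2]
            norm_prev = sum(v * v for v in prev)
            # beta = -sum t_i a_{r-1}(t_i) a_r(t_i) / a_{r-1}^2
            beta = -sum(ti * p * v for ti, p, v in zip(t, prev, cur)) / norm_prev
            nxt = [w + beta * p for w, p in zip(nxt, prev)]
        polys.append(nxt)
    return polys


def fit_orthogonal(t, x, k):
    """Return (beta_hat, S(beta_hat)) computed from the formulas (4.6)."""
    polys = chebyshev_values(t, k)
    norms = [sum(v * v for v in a) for a in polys]
    beta_hat = [sum(xi * v for xi, v in zip(x, a)) / nrm for a, nrm in zip(polys, norms)]
    s = sum(xi * xi for xi in x) - sum(nrm * b * b for nrm, b in zip(norms, beta_hat))
    return beta_hat, s


# Illustrative data (an assumption): x = t^2 observed without error.
grid = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [ti * ti for ti in grid]
beta_k2, s_k2 = fit_orthogonal(grid, values, 2)
beta_k3, s_k3 = fit_orthogonal(grid, values, 3)
```

Because the polynomials are orthogonal, passing from $k$ to $k + 1$ leaves the earlier coefficients unchanged and only subtracts $a_{k+1}^2\hat\beta_{k+1}^2$ from $S(\hat\beta)$.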

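The least squares method is also applied when the dependence of
$\mathrm{E}X_i$ on $\beta$ is not linear. Suppose that

$$X_i = f(t_i, \beta_1, \ldots, \beta_k) + \varepsilon_i, \qquad i = 1, \ldots, n,$$

where $\mathrm{E}\varepsilon_i = 0$, $\mathrm{D}\varepsilon_i = \sigma^2$,
$\mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0$, $i \ne j$.

Then the l.s.e. $\hat\beta$ for the parameter $\beta$ minimizes the expression

$$Q(\beta) = \sum_{i=1}^{n} \left(X_i - f(t_i, \beta_1, \ldots, \beta_k)\right)^2$$

with respect to $\beta$.

Thus, $\hat\beta$ is a solution to the system

$$\frac{\partial Q}{\partial \beta_j}(\beta) = 0, \qquad j = 1, \ldots, k.$$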

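We now calculate the estimates for $\beta$ in an example. Let us seek the
unknown coefficients $(\beta_1, \beta_2, \beta_3)$ of the functional dependence

$$x(t) = \beta_1 + \beta_2 t + \beta_3 t^2.$$

We will assume that the values of the function $x(t)$ have been found
at the points $t_i = 2 + 3i/n$, $i = 1, \ldots, n$. We also assume that the
measurement errors $\varepsilon_i$, $i = 1, \ldots, n$, are independent and
normally distributed with $\mathrm{E}\varepsilon_i = 0$,
$\mathrm{D}\varepsilon_i = \sigma^2$. We then obtain a linear model

$$X_i = x(t_i) + \varepsilon_i = \beta_1 + \beta_2 t_i + \beta_3 t_i^2 + \varepsilon_i, \qquad i = 1, \ldots, n,$$

and the estimates $\hat\beta_1, \hat\beta_2, \hat\beta_3$ satisfy the system of equations

$$\sum_{i=1}^{n} (X_i - \hat\beta_1 - \hat\beta_2 t_i - \hat\beta_3 t_i^2) = 0,$$
$$\sum_{i=1}^{n} (X_i - \hat\beta_1 - \hat\beta_2 t_i - \hat\beta_3 t_i^2)\,t_i = 0, \tag{4.7}$$
$$\sum_{i=1}^{n} (X_i - \hat\beta_1 - \hat\beta_2 t_i - \hat\beta_3 t_i^2)\,t_i^2 = 0.$$

Let $\beta_1 = 3$, $\beta_2 = -1$, $\beta_3 = 1$, $\sigma^2 = 0.04$. We simulate
$\varepsilon_i$ for $n = 25$ and $n = 100$ and find the respective $X$.
System (4.7) gives

(1) $n = 25$: $\hat\beta_1 = 2.583$, $\hat\beta_2 = -0.828$, $\hat\beta_3 = 0.895$, $\tilde\sigma^2 = 0.034$.
(2) $n = 100$: $\hat\beta_1 = 2.992$, $\hat\beta_2 = -1.007$, $\hat\beta_3 = 1.001$, $\tilde\sigma^2 = 0.046$.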
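A simulation of this kind can be sketched as follows. This is only an illustration under stated assumptions: the pseudorandom generator and the seed are arbitrary choices, so the numbers it produces will differ from the values quoted above, and the names `solve` and `simulate_and_fit` are hypothetical.

```python
import random


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small system a x = b."""
    k = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, k):
            f = m[r][c] / m[c][c]
            m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (m[i][k] - sum(m[i][j] * x[j] for j in range(i + 1, k))) / m[i][i]
    return x


def simulate_and_fit(n, beta=(3.0, -1.0, 1.0), sigma2=0.04, seed=0):
    """Simulate X_i = b1 + b2 t_i + b3 t_i^2 + eps_i at t_i = 2 + 3i/n and fit it."""
    rng = random.Random(seed)  # the seed is an arbitrary assumption
    t = [2 + 3 * i / n for i in range(1, n + 1)]
    x = [beta[0] + beta[1] * ti + beta[2] * ti**2 + rng.gauss(0.0, sigma2**0.5)
         for ti in t]
    # Normal equations of the quadratic model: sum_i (X_i - fit_i) t_i^p = 0, p = 0, 1, 2.
    a = [[sum(ti ** (p + q) for ti in t) for q in range(3)] for p in range(3)]
    y = [sum(xi * ti**p for xi, ti in zip(x, t)) for p in range(3)]
    b = solve(a, y)
    s = sum((xi - b[0] - b[1] * ti - b[2] * ti**2) ** 2 for xi, ti in zip(x, t))
    return b, s / (n - 3)  # estimates of beta and of the residual variance


estimates_25, variance_25 = simulate_and_fit(25)
estimates_100, variance_100 = simulate_and_fit(100)
```

Repeating the call with different seeds gives a feeling for how strongly the estimates, especially that of $\beta_1$, fluctuate at small $n$.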
Figures 5 and 6 give the exact curves of the function
$x(t) = \beta_1 + \beta_2 t + \beta_3 t^2$ for $n = 25$ and $n = 100$,
respectively. The sign ○ labels the measurement results $(t_i, X_i)$. The
curves for $\hat x(t) = \hat\beta_1 + \hat\beta_2 t + \hat\beta_3 t^2$ are
also shown.

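4.3. In the normal regression scheme (4.3) the l.s.e.'s $\hat\beta$ coincide
with the maximum likelihood estimates (m.l.e.'s) for the parameters $\beta$.
The $\gamma$-confidence interval for the parameter $\beta_j$ has the form

$$\left(\hat\beta_j \pm t_{\frac{1+\gamma}{2},\,n-k} \sqrt{\frac{a^{jj} S(\hat\beta)}{n-k}}\right), \tag{4.8}$$

where $a^{jj}$ is the $j$th diagonal element of the matrix $A^{-1}$. For the
residual variance we have

$$\frac{S(\hat\beta)}{\chi^2_{\frac{1+\gamma}{2},\,n-k}} < \sigma^2 < \frac{S(\hat\beta)}{\chi^2_{\frac{1-\gamma}{2},\,n-k}}. \tag{4.9}$$

The $\gamma$-confidence region for the vector $t = T\beta$, where $T$ is a given
$m \times k$ matrix with rank $T = m$, is constructed from the formula

$$G_\gamma(X) = \left\{t:\ (T\hat\beta - t)'D^{-1}(T\hat\beta - t) < \frac{m}{n-k}\,S(\hat\beta)\,F_{\gamma,\,m,\,n-k}\right\}, \tag{4.10}$$

where $D = TA^{-1}T'$ [7, pp. 237-238].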
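As an illustration, the interval for a single regression coefficient can be computed for the simple regression $X_i = \beta_1 + \beta_2 t_i + \varepsilon_i$, where the diagonal element of $A^{-1}$ corresponding to $\beta_2$ equals $1/\sum (t_i - \bar t)^2$. The sketch below assumes that the Student quantile is taken from statistical tables and passed in by the caller; the function name `slope_interval` and the data are assumptions made for this example.

```python
from math import sqrt


def slope_interval(t, x, t_quantile):
    """Confidence interval for beta_2 in X_i = beta_1 + beta_2 t_i + eps_i.

    t_quantile is the Student quantile of level (1 + gamma)/2 with n - 2
    degrees of freedom, supplied by the caller from tables.
    """
    n = len(t)
    t_bar = sum(t) / n
    x_bar = sum(x) / n
    stt = sum((ti - t_bar) ** 2 for ti in t)
    b2 = sum((ti - t_bar) * (xi - x_bar) for ti, xi in zip(t, x)) / stt
    b1 = x_bar - b2 * t_bar
    s = sum((xi - b1 - b2 * ti) ** 2 for ti, xi in zip(t, x))  # S(beta_hat)
    a22 = 1.0 / stt  # diagonal element of the inverse of A = ZZ' for beta_2
    half_width = t_quantile * sqrt(a22 * s / (n - 2))
    return b2 - half_width, b2 + half_width


# Illustrative data (an assumption); 4.303 is the tabulated 0.975-quantile
# of Student's distribution with 2 degrees of freedom.
lower, upper = slope_interval([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 4.0], 4.303)
```

Passing the quantile as an argument keeps the sketch free of any dependence on a statistics library and matches the use of printed tables.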
If we have to estimate simultaneously some linear combinations of
the parameters $\beta$, i.e., the quantities $\lambda_r'\beta$, $r = 1, \ldots, m$,
where $\lambda_r$ are given vectors, then the system of the joint confidence
intervals with a confidence level equal to or greater than $\gamma$ has the form

$$\lambda_r'\hat\beta - w_\gamma(X; \lambda_r) < \lambda_r'\beta < \lambda_r'\hat\beta + w_\gamma(X; \lambda_r), \qquad r = 1, \ldots, m, \tag{4.11}$$

where

$$w_\gamma(X; \lambda) = \left[\frac{k}{n-k}\,S(\hat\beta)\,F_{\gamma,\,k,\,n-k}\,\lambda'A^{-1}\lambda\right]^{1/2}$$

(see [7, p. 241]).

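Finally, to test a linear hypothesis of the form $H_0$: $\beta \in B_0 =
\{\beta:\ T\beta = t_0\}$, where $T$ is a given $m \times k$ matrix with
rank $T = m$, and $t_0$ is a given vector, we use an $F$-test with a critical
region of the form

$$\mathfrak{X}_{1\alpha} = \left\{\frac{n-k}{m} \cdot \frac{S_T - S(\hat\beta)}{S(\hat\beta)} \ge F_{1-\alpha,\,m,\,n-k}\right\}, \tag{4.12}$$

where $S_T = \min_{\beta:\ T\beta = t_0} S(\beta)$ is a conditional (under the
hypothesis $H_0$) minimum of $S(\beta)$ [7, pp. 242-243].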



Problems 

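4.1. Given a linear model (4.1) for $k = 2$, write an explicit expression
for the l.s.e. $(\hat\beta_1, \hat\beta_2)$ through $(X_1, \ldots, X_n)$ and
$z^{(i)} = (z_1^{(i)}, z_2^{(i)})$, $i = 1, \ldots, n$.

4.2. Given a simple regression model

$$X_i = \beta_1 + \beta_2 t_i + \varepsilon_i, \qquad i = 1, \ldots, n,$$

find an explicit form for $(\hat\beta_1, \hat\beta_2)$, check whether they are
unbiased, and find the condition for them to be consistent.

4.3. Calculate the estimate $\tilde\sigma^2$ (see (4.5)) for the residual
variance $\sigma^2$ in Problem 4.2. Find the sufficient condition for it to be
consistent.

4.4. Find $\mathrm{cov}(\hat\beta_1, \hat\beta_2)$ for the estimates
$\hat\beta_1$ and $\hat\beta_2$ from Problem 4.2.

4.5. The values of the function $x(t) = \beta_1 + \beta_2 t + \beta_3 t^2$ have
been measured at the points $t_i$, $i = 1, \ldots, n$, i.e.,

$$X_i = \beta_1 + \beta_2 t_i + \beta_3 t_i^2 + \varepsilon_i, \qquad \mathrm{E}\varepsilon_i = 0, \quad \mathrm{D}\varepsilon_i = \sigma^2.$$

Find (1) the l.s.e.'s $\hat\beta_1, \hat\beta_2, \hat\beta_3$ for the parameters
$\beta_1, \beta_2, \beta_3$; (2) $\mathrm{E}\hat\beta_i$, $\mathrm{D}\hat\beta_i$,
$i = 1, 2, 3$; $\mathrm{cov}(\hat\beta_i, \hat\beta_j)$.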
4.6. Is the statistic

$$\hat I = \int_a^b \hat x(t)\,dt$$

an unbiased estimate for the integral $I = \int_a^b x(t)\,dt$, where
$\hat x(t) = \hat\beta_1 + \hat\beta_2 t + \hat\beta_3 t^2$, and $x(t)$,
$\hat\beta_1, \hat\beta_2, \hat\beta_3$ are defined as in Problem 4.5?
Find $\mathrm{D}\hat I$.

4.7. Simulate the observations $X_i = \beta_1 + \beta_2 t_i + \varepsilon_i$,
$i = 1, \ldots, n$, if $n = 100$, $t_i = 2i/n$, $\beta_1 = 2$, $\beta_2 = 1$, and
$\varepsilon_i$ are independent random variables uniformly distributed on the
segment $[-1.386, 1.386]$. Plot the functions $x(t) = 2 + t$ and
$\hat x(t) = \hat\beta_1 + \hat\beta_2 t$ on the segment $[0, 2]$.
Mark the points $(t_i, X_i)$, $i = 1, \ldots, n$.

4.8. Solve Problem 4.7 for normally distributed $\varepsilon_i$ with
$\mathrm{E}\varepsilon_i = 0$, $\mathrm{D}\varepsilon_i = 0.16$.

4.9. Under the conditions of Problem 4.8 construct $\gamma$-confidence
intervals for the parameters $\beta_1$, $\beta_2$, and $\sigma^2$ (see (4.8),
(4.9)) and a $\gamma$-confidence ellipse $G_\gamma(X)$ (see (4.10)) for the
vector $\beta = (\beta_1, \beta_2)$. Use $\gamma = 0.9$ and $\gamma = 0.95$.

Hint. Use the solutions to Problems 4.2 and 4.3.

4.10. Using independent, equally precise measurements $(X_i, t_i)$,
$i = 1, \ldots, n$, of a linear function $x(t) = \beta_1 + \beta_2 t$
(measurement errors are normally distributed as $\mathcal{N}(0, \sigma^2)$ with
an unknown variance), construct a confidence interval for the integral of
this function on the segment $-a \le t \le a$ ($a$ is given). Carry out the
calculations for the data (2.96, -2), (3.20, -1), (3.41, 0), (3.63, 1),
(3.79, 2) for $a = 2$ and a confidence level $\gamma = 0.95$.

4.11. A point is moving uniformly in a straight line. The values of
the coordinate $a(t)$ at the moments $t = 1, 2, 3, 4, 5$ are 12.98, 13.05,
13.32, 14.22, 13.97, respectively. Assuming that the measurement errors
are independent and normally distributed as $\mathcal{N}(0, \sigma^2)$,
construct a 95% confidence ellipse for the point $(a(0), v)$, where $v$ is
the speed of the point.

4.12. Simulate the observations $X_i = \beta_1 + \beta_2 t_i + \beta_3 t_i^2 +
\varepsilon_i$, $i = 1, \ldots, n$, with $\beta_1 = -8$, $\beta_2 = 10$,
$\beta_3 = -2$, $n = 100$, $t_i = 1 + 2i/n$, where $\varepsilon_i$ are
independent random variables uniformly distributed on the segment
$[-1.386, 1.386]$. Plot the functions $x(t) = \beta_1 + \beta_2 t + \beta_3 t^2$
and $\hat x(t) = \hat\beta_1 + \hat\beta_2 t + \hat\beta_3 t^2$ on the segment
$[1, 3]$. Mark the points $(t_i, X_i)$, $i = 1, \ldots, n$.

4.13. Solve Problem 4.12 for normally distributed $\varepsilon_i$ with
$\mathrm{E}\varepsilon_i = 0$, $\mathrm{D}\varepsilon_i = 0.16$.

4.14. In the previous problem construct $\gamma$-confidence intervals for
the parameters $\beta_1$, $\beta_2$, $\beta_3$, and $\sigma^2$ (see (4.8-9)) and
a system of joint confidence intervals with a level equal to or greater
than $\gamma$ for $\beta_1$, $\beta_2$, $\beta_3$ (see (4.11)).

Hint. Use the solution to Problem 4.5.

4.15. In a quadrangle $ABCD$ the results of independent, equally precise
measurements of the angles $ABD$, $DBC$, $ABC$, $BCD$, $CDB$, $BDA$, $CDA$,
$DAB$ (in degrees) are 50.78, 30.25, 78.29, 99.57, 50.42, 40.59, 88.87,
89.86. Assuming that the measurement errors are normally distributed as
$\mathcal{N}(0, \sigma^2)$, find the l.s.e.'s for the angles $\beta_1 = ABD$,
$\beta_2 = DBC$, $\beta_3 = CDB$, $\beta_4 = BDA$. Construct a 95% confidence
interval for $\sigma^2$.

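4.16*. Prove that the l.s.e. $\hat\beta$ is an optimum estimate for $\beta$
in the class of all linear (i.e., linearly dependent on $X$) unbiased
estimates for $\beta$ (i.e., the variances $\mathrm{D}\hat\beta_i$ are minimal
for all $i$). Show that

$$\mathrm{D}(\hat\beta) = \sigma^2 A^{-1} = \sigma^2 \|a^{ij}\| \quad \text{and} \quad \sum_{i=1}^{k} \mathrm{D}\hat\beta_i = \sigma^2\,\mathrm{tr}\,A^{-1} = \sigma^2 \sum_{i=1}^{k} \frac{1}{\lambda_i},$$

where $\lambda_1, \ldots, \lambda_k$ are the eigenvalues of the matrix $A$.
Derive the consistency condition of the estimate $\hat\beta$:
$\min_i \lambda_i \to \infty$ as $n \to \infty$.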

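4.17. Prove that $\tilde\sigma^2$ is an unbiased estimate for the residual
variance $\sigma^2$. Obtain an explicit form for the dependence of
$\tilde\sigma^2$ on $X$ from formula (4.5). Obtain the formulas:
$\mathrm{E}(U) = 0$, $\mathrm{D}(U) = \sigma^2 B$, $\mathrm{cov}(U, \hat\beta) = 0$,
where $U = X - Z'\hat\beta = BX$.

Hint. Use the expansion $S(\beta) = S(\hat\beta) + (\hat\beta - \beta)'A(\hat\beta - \beta)$,
the formulas $\mathrm{cov}(\hat\beta_i, \hat\beta_j) = \sigma^2 a^{ij}$
(Problem 4.16) and $\hat\beta = \beta + A^{-1}Z\varepsilon$, $U = B\varepsilon$.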

4.18. Suppose that the plan matrix $Z$ has orthogonal rows. Find
the l.s.e.'s $\hat\beta_1, \ldots, \hat\beta_k$ and their second moments.

4.19. We have $k$ items with unknown weights $\beta_1, \ldots, \beta_k$. In
order to find the weights, we weigh items in combinations. Each operation
consists of putting a few items on one pan and a few items on the
other, which is then balanced with an additional weight. We obtain
the relations

$$z_1^{(i)}\beta_1 + \ldots + z_k^{(i)}\beta_k = y_i$$

(for the $i$th weighing, $i = 1, \ldots, n$), where $z_j^{(i)} = 1, -1, 0$
(depending on whether the $j$th item is on the left pan, on the right pan,
or not weighed), and $y_i$ is the additional weight. Assuming that the
measurement errors are independent and normally distributed as
$\mathcal{N}(0, \sigma^2)$, estimate the weights of four items using the
following table for eight weighings:

    $\beta_1$     1      1      1      1      1      1      1      1
    $\beta_2$     1     -1      1     -1      1     -1      1     -1
    $\beta_3$     1      1     -1     -1      1      1     -1     -1
    $\beta_4$     1     -1     -1      1      1     -1     -1      1
    Weight      20.2    3.1    9.7    1.9   19.9    8.3   10.2    1.8


Find the covariance matrix for the estimates, and the estimate for $\sigma^2$.
Compare the precision of these estimates with that of the estimates
obtained by weighing every item a few times and finding the arithmetic
mean of these values.

Hint. Use the solution to the previous problem.

4.20. For the data of Problem 4.19 construct a system of joint confidence
intervals for $\beta_1, \ldots, \beta_4$ with a confidence level $\ge 0.95$.

4.21. Find the maximum likelihood estimates for the parameters $\beta$
and $\sigma^2$ of the normal regression in (4.3) and calculate their biases.

4.22. Show that the $\gamma$-confidence interval for an arbitrary linear
combination $\lambda'\beta = \sum_{j=1}^{k} \lambda_j\beta_j$ of the normal
regression coefficients in (4.3) has the form

$$\left(\lambda'\hat\beta \pm t_{\frac{1+\gamma}{2},\,n-k} \sqrt{\frac{1}{n-k}\,S(\hat\beta)\,\lambda'A^{-1}\lambda}\right).$$

4.23. Construct a $\gamma$-confidence interval for the ordinate
$\varphi(t) = \beta_1 + \beta_2 t$ of a regression line at an arbitrary point
$t$ (the model is assumed to be normal). Make calculations for the data of
Problem 4.8 for $t = 1.5$ and $\gamma = 0.95$.

Hint. Use the solutions to Problems 4.2-4 and 4.22.

4.24. Verify that the intervals

$$\hat\beta_i - \hat\beta_j \pm \left[\frac{k}{n-k}\,S(\hat\beta)\,F_{\gamma,\,k,\,n-k}\,(a^{ii} - 2a^{ij} + a^{jj})\right]^{1/2}, \qquad 1 \le j < i \le k,$$

constitute a system of joint confidence intervals at a level greater than
or equal to $\gamma$ for the differences $\beta_i - \beta_j$, $i > j$.

4.25. Construct a system of joint confidence intervals for the mean
values of all the observations $X_1, \ldots, X_n$ in the normal regression
model.
4.26*. Let $T$ be a given $m \times k$ matrix ($m \le k$) with rank $T = m$,
and let $t_0$ be a given $m$-dimensional vector such that the system
$T\beta = t_0$ is compatible. We write $S_T = \min_{\beta:\ T\beta = t_0} S(\beta)$
and call the value of $\beta$ for which $S_T = S(\hat\beta_T)$ the generalized
l.s.e. $\hat\beta_T$. Prove that

$$\hat\beta_T = \hat\beta - A^{-1}T'D^{-1}(T\hat\beta - t_0),$$

where the matrix $D = TA^{-1}T'$ is positive definite. Find the expansion

$$S_T = S(\hat\beta) + (T\hat\beta - t_0)'D^{-1}(T\hat\beta - t_0).$$

4.27. Show that the test of a significance level $\alpha$ for verifying the
hypothesis $H_0$: $\beta_2 = \beta_{20}$ which fixes the slope of the regression
line (in a normal model) is defined by the critical region

$$\mathfrak{X}_{1\alpha} = \left\{|\hat\beta_2 - \beta_{20}| \ge t_{1-\frac{\alpha}{2},\,n-2} \left[\frac{S(\hat\beta)}{(n-2)\sum_{i=1}^{n}(t_i - \bar t)^2}\right]^{1/2}\right\}.$$

Hint. Use the solutions to Problems 4.2-3 and relation (4.12).

4.28. Given the data of Problem 4.8, find the values of the significance
level $\alpha$ for which the hypothesis $H_0$: $\beta_2 = 1.2$ should be
rejected.

4.29. The values of independent random variables $X_j^{(i)}$, $i = 1, 2, 3,
4$; $j = 1, 2$, are given in the following table:

    j \ i       1        2        3        4
      1        8.67     9.71    10.16    13.65
      2       10.03    10.23     9.26    13.79

Assuming that $\mathcal{L}(X_j^{(i)}) = \mathcal{N}(\mu_i, \sigma^2)$ (all the
parameters are unknown), construct the estimates for $\mu_1$, $\mu_2$,
$\mu_3$, $\mu_4$, and $\sigma^2$ and test the homogeneity hypothesis
$H_0$: $\mu_1 = \mu_2 = \mu_3 = \mu_4$ (take the significance level 0.1).
4.30. Construct an interpolation polynomial of the form
$\varphi_k(t; \hat\beta) = \sum_{j=1}^{k} \hat\beta_j a_j(t)$ for $k = 2$ and 3,
where $a_j(t)$ are Chebyshev's polynomials (see (4.6)), using the data for
the unknown function $x = f(t)$ from the following table:

    $t_i$    0.40   0.52   0.61   0.70   0.79   0.86   0.89   0.95   0.99
    $X_i$    0.39   0.50   0.57   0.65   0.71   0.76   0.78   0.81   0.84

How does the precision of the interpolation change when we go from
$k = 2$ to $k = 3$?
4.31. Solve a similar problem for the data

    $t_i$      0      4     10     15     21     29     36     51     68
    $X_i$    66.7   71.0   76.3   80.6   85.7   92.9   99.4  113.6  125.1


4.32. Compute the fourth polynomial in a system of Chebyshev's
orthogonal polynomials.

Hint. Use the recurrence relation

$$a_{r+1}(t) = (t + \alpha)\,a_r(t) + \beta\,a_{r-1}(t),$$

where

$$\alpha = -\sum_{i=1}^{n} t_i a_r^2(t_i) \Big/ a_r^2, \qquad \beta = -\sum_{i=1}^{n} t_i a_{r-1}(t_i) a_r(t_i) \Big/ a_{r-1}^2$$

(see (4.6)).

4.33. Simulate the observations $X_i = t_i^2 + \varepsilon_i$, $i = 1, \ldots, n$,
if $n = 100$, $t_i = 2 + 0.1(i - 1)$, and $\varepsilon_i$ are independent random
variables uniformly distributed on the segment $[0, 0.7]$.

(1) Construct an interpolation polynomial
$\varphi_k(t; \hat\beta) = \sum_{j=1}^{k} \hat\beta_j a_j(t)$ for $k = 2, 3, 4$,
where $a_j(t)$, $j = 1, \ldots, 4$, are Chebyshev's polynomials.

(2) Plot the functions $x = t^2$, $x = \varphi_k(t; \hat\beta)$, $k = 2, 3, 4$.

(3) How does the precision of the interpolation change with the
growth of $k$?

4.34. Solve Problem 4.33 for normally distributed $\varepsilon_i$ with
$\mathrm{E}\varepsilon_i = 0$, $\mathrm{D}\varepsilon_i = 0.04$.

4.35. Solve Problem 4.33 with $X_i = e^{t_i} + \varepsilon_i$.

4.36. Solve Problem 4.35 for normally distributed $\varepsilon_i$ with
$\mathrm{E}\varepsilon_i = 0$, $\mathrm{D}\varepsilon_i = 0.04$.

4.37. Let $Y_1, \ldots, Y_n$ be independent random variables with a common
distribution function $F_0((x - \beta_1)/\beta_2)$, where $F_0(x)$ is a known
continuous distribution function, and the shift parameter $\beta_1$ and the
scale parameter $\beta_2 > 0$ are unknown. Then $Y_j = \beta_1 + \beta_2 U_j$,
where the random variables $U_1, \ldots, U_n$ are independent and their
distribution function is $F_0(x)$. We write
$Y_{(j)} = \beta_1 + \alpha_j\beta_2 + \beta_2\varepsilon_j$ for the respective
order statistics, where $\varepsilon_j = U_{(j)} - \alpha_j$,
$\alpha_j = \mathrm{E}U_{(j)}$, $j = 1, \ldots, n$. Find the estimates for the
parameters $\beta_1$, $\beta_2$ using the least squares method.

Hint. Here the random variables $Y = (Y_{(1)}, \ldots, Y_{(n)})$ satisfy a
model of linear regression with correlated observations, i.e.,
$\mathrm{cov}(\varepsilon_i, \varepsilon_j) = \mathrm{cov}(U_{(i)}, U_{(j)}) \equiv g_{ij}$
are known, and we may go to the uncorrelated variables $X = G^{-1/2}Y$, where
the matrix $G = \|g_{ij}\|$ is assumed to be non-singular.



CHAPTER 5 



Decision Functions 



5.1. Suppose that we are given a sample space $\mathfrak{X} = \{x\}$ of the
values of the observable random variable $X$ and a function $\delta(x)$ on it
whose values are in the set $D = \{d\}$ of possible decisions made from an
observation on some value of $X$. In this case $\delta(x)$ is called a
decision function (rule, procedure). Suppose also that
$\mathcal{L}(X) \in \mathcal{F} = \{F(x; \theta),\ \theta \in \Theta\}$ and for
every pair $(\theta, d) \in \Theta \times D$ a number $L(\theta, d) \ge 0$ is
defined. This number is the loss due to making the decision $d$ when $X$ is
distributed as $F(x; \theta)$. Then $L(\theta, d)$ is said to be a loss
function. In point estimation, for example, decisions are estimates for the
parameter $\theta$, and therefore the decision set $D$ usually coincides with
the parametric set $\Theta$, the decision function $\delta$ is called the
estimate, and the loss $L(\theta, d)$ measures the discrepancy between the
value of $\theta$ and the estimate $d$. As a rule, such problems imply that
$L(\theta, d) = w(|\theta - d|)$, where $w$ is a strictly monotone function
of the error $|d - \theta|$.

The quantity $R(\theta, \delta) = \mathrm{E}_\theta L(\theta, \delta(X))$ is
called a risk function of the procedure $\delta$ and characterizes the
average loss due to the application of the decision rule $\delta$ when the
observable random variable $X$ is distributed as $F(x; \theta)$. If the
condition

$$R(\theta, \delta') \le R(\theta, \delta) \qquad \forall\,\theta \in \Theta \tag{5.1}$$

is met for the two rules $\delta'$ and $\delta$, the strict inequality
holding for at least one $\theta$, then the rule $\delta'$ is preferable. The
rule $\delta$ is then called inadmissible. A decision rule which is not
inadmissible (there is no preferable rule) is called admissible. In
practice we restrict ourselves to the class of admissible decision rules,
no two of which can be compared in the sense of (5.1). In order to choose
the best decision rule among the admissible ones, we use the Bayes and
minimax approaches.
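For a finite sample space and finite sets $\Theta$ and $D$ the risk functions of all non-randomized rules can be tabulated and compared directly. The sketch below does this for a single Bernoulli observation; the 0-1 loss table, the two parameter values, and the function names are assumptions chosen only for illustration.

```python
from fractions import Fraction
from itertools import product


def risk(rule, theta, pmf, loss):
    """R(theta, delta) = E_theta L(theta, delta(X)) on the sample space {0, ..., len(rule)-1}."""
    return sum(pmf(x, theta) * loss[theta][rule[x]] for x in range(len(rule)))


def admissible_rules(thetas, decisions, n_outcomes, pmf, loss):
    """Return the non-randomized rules that are not dominated in the sense of (5.1)."""
    rules = list(product(decisions, repeat=n_outcomes))
    risks = {r: tuple(risk(r, th, pmf, loss) for th in thetas) for r in rules}

    def dominated(r):
        # Another rule is preferable if it is nowhere worse and somewhere strictly better.
        return any(
            all(a <= b for a, b in zip(risks[o], risks[r])) and risks[o] != risks[r]
            for o in rules
        )

    return [r for r in rules if not dominated(r)], risks


# Assumed example: one Bernoulli trial, two parameter values, 0-1 loss.
thetas = (Fraction(1, 3), Fraction(2, 3))
loss = {thetas[0]: {"d1": 0, "d2": 1}, thetas[1]: {"d1": 1, "d2": 0}}


def pmf(x, theta):
    return theta if x == 1 else 1 - theta


admissible, risks = admissible_rules(thetas, ("d1", "d2"), 2, pmf, loss)
```

A rule is written as the tuple $(\delta(0), \delta(1))$; exact fractions are used so that the comparison of risks is not affected by rounding.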

5.2. The Bayes approach implies that the parameter $\theta$ is a random
variable with some (a priori) distribution $\mathcal{L}(\theta)$ given by the
distribution density $\pi(\theta)$ (or probability in the discrete case). We
may calculate the total average loss due to using the rule $\delta$, i.e.,

$$r(\delta) = \mathrm{E}R(\theta, \delta) = \int R(\theta, \delta)\,\pi(\theta)\,d\theta$$

(in the discrete case we have $\sum_i R(\theta_i, \delta)\,\pi(\theta_i)$),
which is called the Bayes risk. All decision rules can be ordered with
respect to this value. The procedure $\delta^*$ which minimizes the Bayes
risk $r(\delta)$ is an optimal rule called the Bayes solution.

We now suggest an algorithm to find the Bayes solution for a given
a priori distribution of the parameter $\pi(\theta)$ [7, pp. 270-271]:

(a) We seek an a posteriori distribution $\pi(\theta|x)$ for $X = x$ using
the formula $\pi(\theta|x) = f(x; \theta)\pi(\theta)/f(x)$, where
$f(x) = \mathrm{E}f(x; \theta) = \int f(x; \theta)\pi(\theta)\,d\theta$
(or $\sum_i f(x; \theta_i)\pi(\theta_i)$ in the discrete case).

(b) We calculate the average loss for the solution $d$ with respect to
the a posteriori distribution, viz.,

$$\mathrm{E}(L(\theta, d)\,|\,x) = \int L(\theta, d)\,\pi(\theta|x)\,d\theta \qquad \left(\text{or } \sum_i L(\theta_i, d)\,\pi(\theta_i|x)\right).$$

(c) We choose the solution $d^* = \delta^*(x)$ for which the average loss
is minimal.
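Steps (a)-(c) can be sketched for a discrete parameter set as follows. This is a minimal illustration, not a general implementation: the Bernoulli model, the uniform a priori distribution, the 0-1 loss, and the name `bayes_decision` are all assumptions made for the example.

```python
from fractions import Fraction


def bayes_decision(x, thetas, prior, pmf, decisions, loss):
    """Return (d*, posterior, average losses) for the observation x."""
    # (a) a posteriori distribution pi(theta | x) = f(x; theta) pi(theta) / f(x)
    joint = [pmf(x, th) * p for th, p in zip(thetas, prior)]
    fx = sum(joint)
    posterior = [j / fx for j in joint]
    # (b) average loss of every decision with respect to the posterior
    average = {
        d: sum(loss(th, d) * q for th, q in zip(thetas, posterior))
        for d in decisions
    }
    # (c) the decision with the minimal average loss (the first one in case of ties)
    best = min(decisions, key=lambda d: average[d])
    return best, posterior, average


# Assumed example: X is one Bernoulli trial, theta is 1/3 or 2/3 with equal
# a priori probabilities, and a wrong decision costs 1.
thetas = (Fraction(1, 3), Fraction(2, 3))
prior = (Fraction(1, 2), Fraction(1, 2))


def pmf(x, theta):
    return theta if x == 1 else 1 - theta


def loss(theta, d):
    return 0 if (theta, d) in ((thetas[0], "d1"), (thetas[1], "d2")) else 1


decision_for_1, posterior_for_1, losses_for_1 = bayes_decision(
    1, thetas, prior, pmf, ("d1", "d2"), loss
)
```

In the continuous case the sums are replaced by integrals, which usually have to be evaluated analytically or numerically.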

5.3. When the a priori information about $\theta$ is absent, we use the
maximal risk $m(\delta) = \sup_\theta R(\theta, \delta)$ to compare the
admissible decision rules. The rule $\bar\delta$ minimizing $m(\delta)$ is
considered to be the best one and is called a minimax decision rule. In
some cases this rule can be constructed if there is an a priori
distribution of the parameter $\pi(\theta) > 0$ for which the risk function
of the respective Bayes rule $\delta^*$ is constant, i.e.,
$R(\theta, \delta^*) = \mathrm{const}$ (this distribution $\pi$ is called the
least favourable a priori distribution), and then $\bar\delta = \delta^*$
[7, p. 271].
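When the sample space, $\Theta$, and $D$ are finite, a minimax rule among the non-randomized rules can be found by direct enumeration. The sketch below is restricted to non-randomized rules, which is a simplifying assumption, and reuses a hypothetical Bernoulli example with a 0-1 loss; the name `minimax_rule` is illustrative.

```python
from fractions import Fraction
from itertools import product


def minimax_rule(thetas, decisions, n_outcomes, pmf, loss):
    """Return (rule, maximal risk) minimizing m(delta) over non-randomized rules."""
    best_rule, best_max = None, None
    for rule in product(decisions, repeat=n_outcomes):
        # m(delta) = max over theta of R(theta, delta)
        max_risk = max(
            sum(pmf(x, th) * loss(th, rule[x]) for x in range(n_outcomes))
            for th in thetas
        )
        if best_max is None or max_risk < best_max:
            best_rule, best_max = rule, max_risk
    return best_rule, best_max


# Assumed example: one Bernoulli trial, theta is 1/3 or 2/3, 0-1 loss.
thetas = (Fraction(1, 3), Fraction(2, 3))


def pmf(x, theta):
    return theta if x == 1 else 1 - theta


def loss(theta, d):
    return 0 if (theta, d) in ((thetas[0], "d1"), (thetas[1], "d2")) else 1


rule, maximal_risk = minimax_rule(thetas, ("d1", "d2"), 2, pmf, loss)
```

In this example the rule found has the same risk for both parameter values, in agreement with the constant-risk criterion stated above.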

5.4. We now turn to an important special case when $\Theta =
\{\theta_1, \ldots, \theta_k\}$, i.e., when only a finite number $k$ of the
distributions $F_i(x) = F(x; \theta_i)$, $i = 1, \ldots, k$, are possible for
the observable random variable $X$, and, given an observation on $X$, we
must choose one of them as the true distribution.

Problems of this kind are called classification problems. Here the
set of possible decisions is $D = \{d_1, \ldots, d_k\}$, where $d_i$ implies
that the distribution $F_i$, $i = 1, \ldots, k$, should be chosen as the true
distribution, and every decision rule $\delta(x)$ generates a partition of
the sample space $\mathfrak{X} = W_1 \cup \ldots \cup W_k$,
$W_i \cap W_j = \varnothing$, $i \ne j$, where $W_i = \{x:\ \delta(x) = d_i\}$,
$i = 1, \ldots, k$. The Bayes solution $\delta^*$ is defined by the partition
$\mathfrak{X} = W_1^* \cup \ldots \cup W_k^*$, where
$W_j^* = \{x:\ h_j(x) = \min_{1 \le i \le k} h_i(x)\}$, $j = 1, \ldots, k$, and

$$h_j(x) = \sum_{i=1}^{k} \pi_i\,l(j|i)\,f_i(x)$$

(if the minimum is attained for a few values of $j$, then we choose the
smallest of them for the subscript). Here $f_i(x)$ is the density of $F_i$,
$\pi_i = \pi(\theta_i)$, and $l(j|i) = L(\theta_i, d_j)$ is the loss due to the
decision $d_j$ when the true distribution is $F_i$. If the loss is
$l(j|i) = 1$, $j \ne i$, or is unknown, or cannot be estimated by a number,
then the Bayes rule is replaced by the principle of maximum a posteriori
probability, which requires that the object with the observation $x$ is
placed in the class whose a posteriori probability

$$\pi_i(x) = \pi_i f_i(x) \Big/ \sum_{j=1}^{k} \pi_j f_j(x), \qquad i = 1, \ldots, k,$$

is maximal. In such cases [7, p. 276] we have

$$W_i^* = \left\{x:\ \pi_i f_i(x) = \max_{1 \le j \le k} \pi_j f_j(x)\right\}, \qquad i = 1, \ldots, k.$$
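The principle of maximum a posteriori probability is easy to sketch in code. The example below assumes two univariate normal classes with unit variance; the densities, the a priori probabilities, and the function names are illustrative assumptions.

```python
from math import exp, pi, sqrt


def normal_pdf(x, mean, sd):
    """Density of the normal law with the given mean and standard deviation."""
    return exp(-((x - mean) ** 2) / (2 * sd * sd)) / (sd * sqrt(2 * pi))


def posterior_probabilities(x, priors, densities):
    """A posteriori probabilities pi_i f_i(x) / sum_j pi_j f_j(x)."""
    scores = [p * f(x) for p, f in zip(priors, densities)]
    total = sum(scores)
    return [s / total for s in scores]


def classify(x, priors, densities):
    """0-based index i maximizing pi_i f_i(x); ties go to the smallest index."""
    scores = [p * f(x) for p, f in zip(priors, densities)]
    return max(range(len(scores)), key=lambda i: scores[i])


# Assumed example: classes N(0, 1) and N(2, 1).
densities = [lambda x: normal_pdf(x, 0.0, 1.0), lambda x: normal_pdf(x, 2.0, 1.0)]
label_equal_priors = classify(1.5, [0.5, 0.5], densities)
label_unequal_priors = classify(1.5, [0.9, 0.1], densities)
```

With equal a priori probabilities the boundary between the two classes lies midway between the means; unequal a priori probabilities shift it towards the less probable class.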

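In order to construct a minimax solution $\bar\delta$, we seek the least
favourable a priori distribution $\pi = (\pi_1, \ldots, \pi_k)$ from the
condition that the components of the risk vector
$R(\delta^*) = (R_1(\delta^*), \ldots, R_k(\delta^*))$ of the respective Bayes
solution are equal, where

$$R_i(\delta^*) = R(\theta_i, \delta^*) = \sum_{j=1}^{k} l(j|i) \int_{W_j^*} f_i(x)\,dx, \qquad i = 1, \ldots, k.$$
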
Problems 

5.1. Let $\mathcal{L}(X) = Bi(1, \theta)$, $\Theta = \{\theta_1 = 1/3,\ \theta_2 = 2/3\}$,
the decision set be $D = \{d_1, d_2\}$, and let the loss function
$L(\theta_i, d_j)$ be given by the table



9i 






) 


2 
(1) Find the admissible decision rules and the minimax solution
among them.

(2) Find a Bayes solution $\delta^*$ for an arbitrary a priori distribution
$\pi(\theta_1) = \alpha$, $\pi(\theta_2) = 1 - \alpha$, $\alpha \in [0, 1]$, and
plot the Bayes risk $\rho(\alpha) = r(\delta^*)$ as a function of $\alpha$.

5.2. Find all Bayes solutions for $\mathcal{L}(X) = Bi(1, \theta)$,
$\Theta = \{\theta_1 = 1/4,\ \theta_2 = 3/4\}$, $D = \{d_1, d_2, d_3\}$, if the
loss function $L(\theta_i, d_j)$ is given by the table



f), 



d, 



di 






1 


1/2 


4 





1/2 



Plot the Bayes risk $\rho(\alpha) = r(\delta^*)$ as a function of
$\alpha = \pi(\theta_1)$, $\alpha \in [0, 1]$.

Hint. Compare the average loss with respect to the a posteriori
distribution given in the solution to Problem 5.1 (2).

5.3. Let $\mathcal{L}(X) = Bi(1, \theta)$, $\Theta = \{\theta_1, \theta_2\}$,
$D = \{d_1, d_2\}$, and let the loss function $L(\theta_i, d_j)$ be given by
the table



0z 



d> 



a 

b 





for $a, b > 0$. Consider the cases of $\theta_1 = 2/3$, $\theta_2 = 1/2$ and
$\theta_1 = 3/4$, $\theta_2 = 1/2$.

Show that in both cases the sets of admissible decision rules coincide,
but in the second case the Bayes solution is preferable for any
a priori distribution of the parameter.

Hint. Use the solution to Problem 5.1.

5.4. Suppose that -f(X) = Bi{3, ff), B = (0i = 10 " s , 62 - 10 -1 ]. 
the decision set is $D = \{d_1, d_2\}$, and the loss function
$L(\theta_i, d_j)$ is given by the table
*






2 


1 






8* 



Consider the decision rules $\delta_i = (\delta_i(0), \delta_i(1), \delta_i(2),
\delta_i(3))$, where $\delta_1 = (d_1, d_2, d_2, d_2)$,
$\delta_2 = (d_1, d_1, d_2, d_2)$, $\delta_3 = (d_1, d_1, d_1, d_2)$.

Show that the rules cannot be compared and find the minimax solution
among them.

5.5. Suppose that $\mathcal{L}(X) = \overline{Bi}(1, \theta)$,
$\Theta = \{\theta_1, \theta_2\}$, $D = \{d_1, d_2\}$, and
the loss function is given by the table



6, 



rfj 



* 



a 

b 



Find the minimax decision function among the functions

$$\delta_i(x) = \begin{cases} d_1 & \text{for } x = 0, 1, \ldots, i - 1, \\ d_2 & \text{for } x = i, i + 1, \ldots, \end{cases} \qquad i = 1, 2, \ldots$$



5.6. Show that if in the previous problem we take the Poisson law
$\Pi(\theta)$ as $\mathcal{L}(X)$, then for
$a(1 - e^{-\theta_1}) \le b e^{-\theta_2}$ we will have $\bar\delta = \delta_1$,
and the respective risk vector will be
$(a(1 - e^{-\theta_1}),\ b e^{-\theta_2})$.

5.7. Let $X$ be a random variable distributed either as $F_1(x) =
F(x; \theta_1)$ or as $F_2(x) = F(x; \theta_2)$. Let the decision set be
$D = \{d_1, d_2\}$, and let the loss function be given by the table



A 



», 

62 



a 

b 
Construct a Bayes solution for a given a priori distribution
$(\pi_1, \pi_2)$ and calculate the respective risk (compare with
Problem 3.51). Consider the case when $F_i$ is the normal distribution
$\mathcal{N}(\theta_i, \sigma^2)$, $i = 1, 2$.

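5.8*. Let the observable random variable $X$ be distributed normally
with an unknown mean $\theta$ and known variance $\sigma^2$, the decision set
be $D = \{d_1, d_2, d_3\}$, and let the loss function $L(\theta, d)$ be given
by the table

                    $d_1$    $d_2$    $d_3$
    $\theta < 0$      0        1        2
    $\theta = 0$      1        0        1
    $\theta > 0$      2        1        0

Consider the decision functions

$$\delta_{a,b}(x) = \begin{cases} d_1 & \text{for } x < a, \\ d_2 & \text{for } a \le x \le b, \\ d_3 & \text{for } x > b, \end{cases}$$

where $a < 0 < b$.

Show that the risk function has the form

$$R(\theta, \delta_{a,b}) = \begin{cases} \Phi\left(\dfrac{\theta - a}{\sigma}\right) + \Phi\left(\dfrac{\theta - b}{\sigma}\right) & \text{for } \theta < 0, \\[2mm] \Phi\left(\dfrac{a}{\sigma}\right) + \Phi\left(-\dfrac{b}{\sigma}\right) & \text{for } \theta = 0, \\[2mm] \Phi\left(\dfrac{a - \theta}{\sigma}\right) + \Phi\left(\dfrac{b - \theta}{\sigma}\right) & \text{for } \theta > 0, \end{cases}$$

and plot it for $b = -a$.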

5.9. Suppose that $\Theta = \{0, 1\}$, $D = \{d\} = [0, 1]$, and the loss
function is $L(\theta, d) = |\theta - d|^a$, $a \ge 1$. Consider a class of
decision functions of the form $\delta(x) \equiv \mathrm{const}$ (i.e., the
solution is found without preliminary observations). Find in this class a
Bayes solution for the a priori distribution of the parameter
$\pi(0) = \alpha$, $\pi(1) = 1 - \alpha$, $\alpha \in [0, 1]$.

5.10*. (1) Show that for the risk of the Bayes solution in the
classification problem (see Sec. 5.4) the representation

$$r(\delta^*) = \int \min_{1 \le j \le k} h_j(x)\,dx$$

is true. (For a discrete random variable all the integrals below are
replaced by the respective sums.)

(2) Introduce the variables

$$I_{ij} = \int \min(\pi_i f_i(x),\ \pi_j f_j(x))\,dx, \qquad \bar l = \max_{i \ne j} l(j|i), \qquad \underline{l} = \min_{i \ne j} l(j|i)$$

and prove the following estimates for $r(\delta^*)$:

$$\underline{l} \sum_{j=2}^{k} \max_{i < j} I_{ij} \le r(\delta^*) \le \bar l \sum_{1 \le i < j \le k} I_{ij}.$$

In what case do the estimates coincide?

Hint. Use the identity

$$\sum_{j=1}^{k} a_j = \max_{1 \le j \le k} a_j + \sum_{i=2}^{k} \min\left(a_i,\ \max_{j < i} a_j\right)$$

(use induction on $k$ in the proof).
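5.11*. (Continued from Problem 5.10.) Let $F_i(x)$ be the distribution
function of an $r$-variate non-degenerate normal law
$\mathcal{N}(\mu^{(i)}, A)$, $i = 1, 2$. Prove the formula

$$I_{12} = \pi_1 \Phi\left(-\frac{\sqrt{\varrho}}{2} - \frac{1}{\sqrt{\varrho}} \ln\frac{\pi_1}{\pi_2}\right) + \pi_2 \Phi\left(-\frac{\sqrt{\varrho}}{2} + \frac{1}{\sqrt{\varrho}} \ln\frac{\pi_1}{\pi_2}\right),$$

where $\varrho = (\mu^{(1)} - \mu^{(2)})'A^{-1}(\mu^{(1)} - \mu^{(2)})$ is the
Mahalanobis distance between the distributions $\mathcal{N}(\mu^{(1)}, A)$
and $\mathcal{N}(\mu^{(2)}, A)$. Derive a similar formula for $I_{12}$ in the
case of two Poisson's distributions.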

5.12*. Construct the Bayes and minimax solutions for the classifica- 
tion problem with the two normal distributions given in the previous 
problem (compare with Problem 3.52). 
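5.13*. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$\mathcal{L}(\xi) \in \mathcal{F} = \{F(x; \theta),\ \theta \in \Theta\}$, and
let the a priori distribution of the parameter be
$\mathcal{L}(\theta) \in \tilde{\mathcal{F}}$. The family $\tilde{\mathcal{F}}$
of the distributions of the parameter is said to be conjugate to
$\mathcal{F}$ (denoted $\tilde{\mathcal{F}} \triangleleft \mathcal{F}$) if for
$X = x$ the a posteriori distribution is
$\mathcal{L}(\theta|x) \in \tilde{\mathcal{F}}$.

Show that the following assertions are true (in what follows
$x = \sum_{i=1}^{n} x_i$):

(1) $B(a, b) \triangleleft Bi(m, \theta)$ with $\mathcal{L}(\theta|x) = B(a + x,\ b + nm - x)$.

(2) $B(a, b) \triangleleft \overline{Bi}(r, \theta)$ with $\mathcal{L}(\theta|x) = B(a + x,\ b + nr)$.

(3) $\Gamma(a, \lambda) \triangleleft \Pi(\theta)$ with $\mathcal{L}(\theta|x) = \Gamma\left(\dfrac{a}{na + 1},\ \lambda + x\right)$.

(4) $\Gamma(a, \lambda) \triangleleft \Gamma(\theta^{-1}, 1)$ with $\mathcal{L}(\theta|x) = \Gamma\left(\dfrac{a}{ax + 1},\ \lambda + n\right)$.

(5) $\Pi(a, \alpha) \triangleleft R(0, \theta)$, where Pareto's distribution
$\Pi(a, \alpha)$ is defined by the density
$\pi(\theta) = \alpha a^{\alpha}/\theta^{\alpha + 1}$, $\theta \ge a$
$(a, \alpha > 0)$, with
$\mathcal{L}(\theta|x) = \Pi(\max(a, x_1, \ldots, x_n),\ \alpha + n)$.

(6) $D(\alpha) \triangleleft M(n;\ \theta = (\theta_1, \ldots, \theta_N))$, where
Dirichlet's distribution $D(\alpha)$, $\alpha = (\alpha_1, \ldots, \alpha_N)$,
$\alpha_i > 0$, $i = 1, \ldots, N$, is defined by the density

$$\pi(\theta) = \frac{\Gamma(\alpha_1 + \ldots + \alpha_N)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_N)}\,\theta_1^{\alpha_1 - 1} \cdots \theta_N^{\alpha_N - 1}, \qquad \theta_i > 0, \quad \theta_1 + \ldots + \theta_N = 1,$$

with $\mathcal{L}(\theta\,|\,h = (h_1, \ldots, h_N)) = D(\alpha + h)$.

(7) $\mathcal{N}(\mu, \sigma^2) \triangleleft \mathcal{N}(\theta, b^2)$ with
$\mathcal{L}(\theta|x) = \mathcal{N}(\mu_1, \sigma_1^2)$, where

$$\mu_1 = \sigma_1^2 \left(\frac{\mu}{\sigma^2} + \frac{x}{b^2}\right), \qquad \sigma_1^2 = \left(\frac{1}{\sigma^2} + \frac{n}{b^2}\right)^{-1}.$$

Hint. It is sufficient to calculate the density of any distribution
up to the normalizing factor. Therefore, using the notation
$p_\xi(t) = c\,p(t) \propto p(t)$ for any random variable $\xi$ (here the
constant $c$ is defined by the condition $c \int p(t)\,dt = 1$), when finding
the a posteriori density $\pi(\theta|x) = f(x; \theta)\pi(\theta)/f(x)$ we may
restrict ourselves to calculating the numerator $f(x; \theta)\pi(\theta)$.
We must act similarly when calculating the densities $\pi(\theta)$ and
$f(x; \theta)$.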
5.14. Let us estimate the unknown probability $\theta$ of success from
the given number of successes $X$ in $n$ Bernoulli trials (here
$\Theta = D = (0, 1)$). Suppose that the loss function has the form

$$L(\theta, d) = \frac{(d - \theta)^2}{\theta(1 - \theta)}$$

and the a priori distribution of the parameter is
$\mathcal{L}(\theta) = R(0, 1)$. Prove that the Bayes solution is
$\delta^*(x) = x/n$ and that it is a minimax solution with
$r(\delta^*) = 1/n$.

5.15. (Continued from Problem 5.14.) Find a Bayes solution $\delta^*$ for
the case when the loss function is $L(\theta, d) = (d - \theta)^2$ and the
a priori distribution is $\mathcal{L}(\theta) = B(a, b)$. For what values of
the parameters $a$ and $b$ is $\delta^*$ a minimax solution? (Compare with
Problem 2.6.)

Hint. Use the solution to Problem 5.13 (1).

5.16. Consider the point estimation of a scalar parameter $\theta$ from
the point of view of the decision theory when the decision set $D$
coincides with the parametric set $\Theta$ and the decision $d \in D$ is the
estimate of the parameter $\theta \in \Theta$. For a loss function of the form
$L(\theta, d) = (d - \theta)^2$, the risk function
$R(\theta, \delta) = \mathrm{E}_\theta(\delta(X) - \theta)^2$ is the mean
square error of the estimate $\delta(X)$. Prove that the Bayes solution
(Bayes estimate) $\delta^*(x)$ for the observation $X = x$ is

$$\delta^*(x) = \mathrm{E}(\theta|x) = \int \theta\,\pi(\theta|x)\,d\theta,$$

i.e., coincides with the a posteriori mean of the parameter, and the
respective risk is $r(\delta^*) = \mathrm{E}\,\mathrm{D}(\theta|X)$, where

$$\mathrm{D}(\theta|x) = \mathrm{E}\left((\theta - \delta^*(x))^2\,|\,x\right) = \int (\theta - \delta^*(x))^2\,\pi(\theta|x)\,d\theta$$

is the variance of the a posteriori distribution of the parameter. The
mathematical expectation is calculated with respect to the density $f(x)$
(probability in the discrete case). It is assumed here that all the
respective moments do exist.

Apply the result to solve Problem 5.15.

5.17. Suppose that Bernoulli's trials are carried out until the $r$th
failure, and $X$ is the number of successes in these trials. Given an
observation on $X$, construct a Bayes estimate for the unknown probability
$\theta$ of success when the loss function is $L(\theta, d) = (d - \theta)^2$,
and the a priori distribution is $\mathcal{L}(\theta) = B(a, b)$.

Hint. Use the solutions to Problems 5.16 and 5.13 (2).

5.18. Given a sample $X = (X_1, \ldots, X_n)$ from Poisson's distribution
$\Pi(\theta)$, construct a Bayes estimate for the parameter if the loss
function is $L(\theta, d) = (d - \theta)^2$, and the a priori distribution is
$\mathcal{L}(\theta) = \Gamma(a, \lambda)$. Calculate the risk of this
estimate and define the optimum sample size if the price of one
observation is $c > 0$ (i.e., the size minimizing the total loss
$r(\delta^*) + cn$).

Hint. Use the solutions to Problems 5.16, 5.13 (3), and 1.39 (4).
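5.19. (Continued from Problem 5.18.) Show that given the loss function
$L(\theta, d) = (d - \theta)^2/\theta$, the Bayes estimate for
$\lambda + \sum_{i=1}^{n} x_i > 1$ has the form

$$\delta^*(x) = \frac{a}{na + 1}\left(\lambda + \sum_{i=1}^{n} x_i - 1\right),$$

and its risk is

$$r(\delta^*) = \frac{a}{na + 1}.$$

Hint. When calculating the moments, use the formula
$\Gamma(z + 1) = z\Gamma(z)$.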
5.20. Given a sample $X = (X_1, \ldots, X_n)$, estimate the parameter
$\theta$ of an exponential distribution with the density
$f(x; \theta) = \theta e^{-\theta x}$, $x > 0$. Suppose that the loss function
is $L(\theta, d) = (1/\theta - d)^2$, and the a priori distribution is
$\mathcal{L}(\theta) = \Gamma(a, \lambda)$, $\lambda > 2$. Prove that the Bayes
estimate is of the form

$$\delta^*(x) = \frac{1}{a(\lambda + n - 1)}\left(a \sum_{i=1}^{n} x_i + 1\right),$$

and its risk is

$$r(\delta^*) = \left(a^2(\lambda + n - 1)(\lambda - 1)(\lambda - 2)\right)^{-1}.$$

Show that the optimal number of observations is

$$n^* \approx \frac{1}{a\sqrt{c(\lambda - 1)(\lambda - 2)}} - \lambda + 1$$

when the price of one observation is $c > 0$.

Hint. Use the solutions to Problems 5.13 (4) and 1.39 (2).
5.21. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution
$R(0, \theta)$, where the a priori distribution of the parameter $\theta$ is
Pareto's distribution with the parameters $a$ and $\alpha > 2$ (see
Problem 5.13 (5)). Show that the Bayes estimate for $\theta$ is of the form

$$\delta^*(x) = \frac{n + \alpha}{n + \alpha - 1}\,\max(a, x_{(n)}), \qquad x_{(n)} = \max_{1 \le i \le n} x_i,$$

and calculate its risk. Find the optimum sample size if the price of
one observation is $c > 0$.

Hint. Use the solutions to Problems 5.16, 5.13 (5), and 1.35.
5.22. Given an observation $X$, estimate the parameter $\theta$ of a uniform
distribution $R(0, \theta)$, the parameter's a priori density being
$\pi(\theta) = \theta e^{-\theta}$, $\theta > 0$. Prove that the Bayes estimate
for a quadratic loss function has the form

$$\delta^*(X) = X + 1,$$

and its risk is

$$r(\delta^*) = 1.$$

Hint. Write the average a posteriori loss for the decision $d$ in the
form of an integral and differentiate it with respect to $d$. Use
the formula $\Gamma(n + 1) = \int_0^\infty t^n e^{-t}\,dt = n!$

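5.23*. Let the vector $\nu = (\nu_1, \ldots, \nu_N)$ have a multinomial
distribution $M(n;\ \theta = (\theta_1, \ldots, \theta_N))$. Given an
observation on $\nu$, estimate $\theta$ if the loss function is

$$L(\theta, d) = \sum_{i=1}^{N} (d_i - \theta_i)^2, \qquad d = (d_1, \ldots, d_N),$$

under the assumption that the parameter $\theta$ has an a priori Dirichlet's
distribution $D(\alpha)$ (see Problem 5.13 (6)). Show that for
$\nu = h = (h_1, \ldots, h_N)$ the Bayes estimate
$\delta^*(h) = (\delta_1^*(h), \ldots, \delta_N^*(h))$ has the form

$$\delta_i^*(h) = \frac{\alpha_i + h_i}{\alpha + n}, \qquad i = 1, \ldots, N, \qquad \alpha = \sum_{i=1}^{N} \alpha_i,$$

and its risk is

$$r(\delta^*) = \frac{\alpha^2 - \sum_{i=1}^{N} \alpha_i^2}{\alpha(\alpha + 1)(\alpha + n)}.$$

Hint. Use the solutions to Problems 5.13 (6) and 1.52, and the
formulas

$$\mathrm{E}\theta_i^r = \frac{\alpha_i(\alpha_i + 1) \cdots (\alpha_i + r - 1)}{\alpha(\alpha + 1) \cdots (\alpha + r - 1)}$$

for the moments of the distribution $D(\alpha)$.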
5.24. Given a sample $X = (X_1, \ldots, X_n)$ from the normal distribution
$\mathcal{N}(\theta, b^2)$, construct a Bayes estimate for the parameter, which
will minimize the standard error for the a priori distribution
$\mathcal{L}(\theta) = \mathcal{N}(\mu, \sigma^2)$. Calculate the risk of the resultant estimate and define
the optimum number of observations if the price of one observation is $c > 0$.

Hint. Use the solutions to Problems 5.16 and 5.13 (7).

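5.25*. Estimate a scalar parameter if the loss function is

$$L(\theta, d) = |\theta - d|, \qquad \theta, d \in R^1.$$

(1) Prove that for $X = x$ the Bayes solution $d^* = \delta^*(x)$ is the median
of the a posteriori distribution $\mathcal{L}(\theta \mid x)$ for any a priori distribution
$\mathcal{L}(\theta)$, i.e., it is a number such that

$$P(\theta \le d^* \mid x) \ge \tfrac{1}{2}, \qquad P(\theta \ge d^* \mid x) \ge \tfrac{1}{2}.$$

(2) Use this result to estimate the mean in the model $\mathcal{N}(\theta, b^2)$ when
$\mathcal{L}(\theta) = \mathcal{N}(\mu, \sigma^2)$.

Hints. (1) Prove the inequality $\mathrm{E}(|\theta - d| \mid x) \ge \mathrm{E}(|\theta - d^*| \mid x)$ for all $d \in R^1$.

(2) Use the solution to Problem 5.13 (7).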
Statistics of Stationary Sequences

A sequence of random variables $\{X_t\}$, $t = \ldots, -1, 0, 1, \ldots$, which is unlimited
on both ends is said to be stationary if the conditions

$$\mathrm{E}X_t = m = \text{const},$$

$$\operatorname{cov}(X_{k+t}, X_t) = \mathrm{E}(X_{k+t} - m)(X_t - m) = R_k$$

are met.

A sequence of numbers $\{R_k\}$, $k = \ldots, -1, 0, 1, \ldots$, is called the
covariance function of the sequence $\{X_t\}$. Here $R_{-k} = R_k$ for all $k$
and $R_0 = \mathrm{D}X_t = \sigma^2 = \text{const}$. We will assume that

$$\sum_{k=1}^{\infty} |R_k| < \infty. \qquad (6.1)$$

The statistics

$$\bar{X} = \frac{1}{n} \sum_{t=1}^{n} X_t, \qquad \hat{C}_k(n) = \frac{1}{n - k} \sum_{t=1}^{n-k} (X_t - \bar{X})(X_{t+k} - \bar{X}), \qquad k = 0, 1, \ldots, n - 1,$$

are used to estimate $m$ and $R_k$ from the observations $X_1, \ldots, X_n$.
As an example we simulate $n$ terms of a stationary sequence

$$X_t = \xi_{t-1} + \xi_t + \xi_{t+1}, \qquad t = 1, 2, \ldots, n, \qquad (6.2)$$

where $\xi_t$, $t = 0, \pm 1, \pm 2, \ldots$, are independent random variables uniformly
distributed on the segment $[0, 2h]$. Since $\mathrm{E}\xi_t = h$ and $\mathrm{D}\xi_t = h^2/3$, we can easily show that
$\mathrm{E}X_t = 3h$, $R_0 = h^2$, $R_1 = 2h^2/3$, $R_2 = h^2/3$, $R_k = 0$, $k \ge 3$. Tables 6.1
and 6.2 give the values of the statistics $\bar{X}$ and $\hat{C}_k(n)$ for different $k$, $n$, and $h$.

Table 6.1

   n     X̄      Ĉ_0(n)   Ĉ_1(n)   Ĉ_2(n)   Ĉ_3(n)   Ĉ_4(n)
   10    1.70   0.372    0.264    0.088    -0.128   -0.242
   100   1.41   0.270    0.185    0.092     0.026    0.047
   1000  1.50   0.262    0.178    0.089     0.002   -1.0×10^-4

h = 0.5 ($\mathrm{E}X_t = 1.5$, $R_0 = 0.25$, $R_1 = 0.166\ldots$, $R_2 = 0.0833\ldots$).

Table 6.2

   n     X̄      Ĉ_0(n)   Ĉ_1(n)   Ĉ_2(n)   Ĉ_3(n)   Ĉ_4(n)
   10    2.040  0.535    0.381    0.127    -0.184   -0.349
   100   1.690  0.389    0.266    0.132     0.039    0.068
   1000  1.800  0.378    0.257    0.126     0.003   -2.0×10^-…

h = 0.6 ($\mathrm{E}X_t = 1.8$, $R_0 = 0.36$, $R_1 = 0.24$, $R_2 = 0.12$).
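
A table of this kind can be produced with a few lines of code. The following is a minimal sketch, assuming Python's standard pseudo-random generator in place of a table of random numbers; the function names are hypothetical, and the printed values depend on the seed, so they will not coincide with the tables above.

```python
import random

def simulate_moving_sum(n, h, rng):
    """Return X_1..X_n with X_t = xi_{t-1} + xi_t + xi_{t+1}, xi_t uniform on [0, 2h]."""
    xi = [rng.uniform(0.0, 2.0 * h) for _ in range(n + 2)]   # xi_0 .. xi_{n+1}
    return [xi[t - 1] + xi[t] + xi[t + 1] for t in range(1, n + 1)]

def sample_mean(x):
    return sum(x) / len(x)

def sample_covariance(x, k):
    """(1/(n-k)) * sum over t of (X_t - mean)(X_{t+k} - mean)."""
    n, xbar = len(x), sample_mean(x)
    return sum((x[t] - xbar) * (x[t + k] - xbar) for t in range(n - k)) / (n - k)

rng = random.Random(0)
h = 0.5
for n in (10, 100, 1000):
    x = simulate_moving_sum(n, h, rng)
    row = [sample_mean(x)] + [sample_covariance(x, k) for k in range(5)]
    print(n, " ".join(f"{v:8.4f}" for v in row))
```

For small $n$ the estimates fluctuate strongly, as in the first rows of the tables; they settle near the theoretical values only for large $n$.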

The spectral density $f(x)$ is an important characteristic of a stationary
sequence $\{X_t\}$. It is a Fourier transform of the covariance function
$\{R_k\}$, i.e.,

$$f(x) = \frac{1}{2\pi} \sum_{k=-\infty}^{\infty} R_k \cos kx, \qquad x \in [-\pi, \pi]. \qquad (6.3)$$

The spectral density (if it exists) and the covariance function are in
a one-to-one correspondence. The statistics of the form

$$f_n(x) = \frac{1}{2\pi} \sum_{|k| \le n-1} w_n(k)\,\hat{C}_k(n) \cos kx \qquad (6.4)$$

are used to estimate $f(x)$ from the observations $X_1, \ldots, X_n$, where
$\{w_n(k)\}$ is a sequence of weighting coefficients ($w_n(-k) = w_n(k)$).
Specifically, for $w_n(k) = 1 - (|k|/n)$ we obtain the periodogram of the
sample. If the mean $m = \mathrm{E}X_t$ is known, then we replace $\bar{X}$ by $m$ in
(6.4).
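
The estimate (6.4) can be computed directly from its definition. The sketch below assumes the convention that the sample covariance at lag $-k$ equals the one at lag $k$; the function names and the `weight` callback are hypothetical.

```python
import math

def spectral_estimate(x, freq, weight, mean=None):
    """(1/(2*pi)) * sum over |k| <= n-1 of w_n(k) * C_k(n) * cos(k * freq)."""
    n = len(x)
    m = sum(x) / n if mean is None else mean   # use the known mean if it is given
    total = 0.0
    for k in range(n):
        c_k = sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / (n - k)
        term = weight(k, n) * c_k * math.cos(k * freq)
        total += term if k == 0 else 2.0 * term   # lags k and -k contribute equally
    return total / (2.0 * math.pi)

def periodogram_weight(k, n):
    """Weights 1 - |k|/n, which turn the estimate into the periodogram."""
    return 1.0 - abs(k) / n
```

With `periodogram_weight` the result is non-negative, because it equals the squared modulus of the finite Fourier sum of the centred observations divided by $2\pi n$. Other weight functions can be passed in the same way.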



Problems 

6.1. Prove that the arithmetic mean $\bar{X} = (X_1 + \ldots + X_n)/n$ is an unbiased
and consistent estimator for $m = \mathrm{E}X_t$.

6.2. Prove that the statistic

$$\hat{C}_k(n) = \frac{1}{n - k} \sum_{t=1}^{n-k} (X_t - m)(X_{t+k} - m), \qquad 0 \le k < n,$$

is an unbiased estimator for $R_k$.

6.3*. Prove that the statistic

$$\hat{C}_k(n) = \frac{1}{n - k} \sum_{t=1}^{n-k} (X_t - \bar{X})(X_{t+k} - \bar{X})$$

is an asymptotically unbiased estimator for $R_k$ as $n \to \infty$, i.e.,
$\mathrm{E}\hat{C}_k(n) \to R_k$ ($k$ is fixed).

6.4. Let $\xi$ and $\eta$ be random variables with $\mathrm{E}\xi = \mathrm{E}\eta = 0$,
$\mathrm{D}\xi = \mathrm{D}\eta = \sigma^2$, $\operatorname{cov}(\xi, \eta) = 0$. Prove that the sequence
$X_t = \xi \cos \lambda t + \eta \sin \lambda t$, $t = 0, \pm 1, \pm 2, \ldots$, $\lambda \in (0, \pi)$, is stationary and
calculate its covariance function.

6.5. Let $\xi_t$, $t = 0, \pm 1, \pm 2, \ldots$, be uncorrelated random variables
with $m = \mathrm{E}\xi_t$, $\sigma^2 = \mathrm{D}\xi_t$. Is the sequence $\{\xi_t\}$ stationary?

Prove that the sequence

$$X_t = \sum_{j=0}^{r} a_j \xi_{t-j}, \qquad t = 0, \pm 1, \pm 2, \ldots,$$

is stationary. Find $\mathrm{E}X_t$ and $R_k$.

6.6*. Prove that for the sequence (6.2) the estimator $\hat{C}_k(n)$ found
in Problem 6.2 is consistent.

6.7. Simulate a sequence of the form (6.2) for the case when $\xi_t$ are
normally distributed with $\mathrm{E}\xi_t = 0.5$, $\mathrm{D}\xi_t = 0.1$, and $n = 100$. Compile
the respective table similar to Table 6.1.
6.8*. Given the values $X_t$, $t = -n, -n + 1, \ldots, -1, 0$,
predict the value of $X_1$, i.e., find the optimal linear predictor

$$\hat{X}_1(n) = \sum_{t=-n}^{0} \beta_t(n) X_t,$$

which means that $\beta_t(n)$ must minimize the expression

$$\mathrm{E}\Bigl( X_1 - \sum_{t=-n}^{0} \beta_t X_t \Bigr)^2.$$

Calculate the minimal mean square error
$\sigma^2(n) = \mathrm{E}(X_1 - \hat{X}_1(n))^2$ of the prediction.

6.9. Let $\nu_t$, $t = 0, \pm 1, \pm 2, \ldots$, be a stationary Markov chain with
states 1 and 2 [2].

Prove that the matrix $\|p_{ij}(t)\|$, where

$$p_{ij}(t) = P(\nu_{s+t} = j \mid \nu_s = i), \qquad i, j = 1, 2,$$

is defined by the formula

$$\|p_{ij}(t)\| = \frac{1}{2} \begin{Vmatrix} 1 + (1 - 2\alpha)^t & 1 - (1 - 2\alpha)^t \\ 1 - (1 - 2\alpha)^t & 1 + (1 - 2\alpha)^t \end{Vmatrix}$$

if

$$\|p_{ij}(1)\| = \begin{Vmatrix} 1 - \alpha & \alpha \\ \alpha & 1 - \alpha \end{Vmatrix}, \qquad 0 < \alpha < 1.$$

Find a stationary distribution for this chain.

6.10. Make a program for simulating the sequence $\nu_t$, $t = 0, 1, \ldots, n$,
defined in Problem 6.9.
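
One possible program of this kind is sketched below. It assumes that the chain changes its state with probability $\alpha$ at each step and starts from the uniform distribution on the two states; both the function name and these conventions are assumptions of the sketch.

```python
import random

def simulate_chain(n, alpha, rng):
    """Return nu_0..nu_n for a two-state chain that switches state with probability alpha."""
    state = rng.choice((1, 2))      # assumed start: each state with probability 1/2
    path = [state]
    for _ in range(n):
        if rng.random() < alpha:
            state = 3 - state       # 1 -> 2 and 2 -> 1
        path.append(state)
    return path

path = simulate_chain(20, 1.0 / 3.0, random.Random(0))
print(path)
```

The same function can supply the chain needed in Problems 6.12 and 6.14, where $\alpha = 1/3$.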

6.11. Let $\{\nu_t\}$ be the stationary Markov chain defined in Problem
6.9. Is the sequence $\{\eta_t\}$ with

$$\eta_t = \begin{cases} \ldots & \text{for } \nu_t = 2, \\ \ldots & \text{for } \nu_t = 1 \end{cases}$$

stationary? Find $\mathrm{E}\eta_t$ and $R_k$.

6.12. Simulate a sequence $\eta_1, \ldots, \eta_{100}$, where $\eta_t$ is defined as in
the previous problem and $\alpha = 1/3$. Calculate the estimates for $\mathrm{E}\eta_t$ and
$R_k$.

6.13. Let $\nu_t$ be the Markov chain from Problem 6.9, and let $\xi_1(t)$,
$\xi_2(t)$, $t = 0, \pm 1, \pm 2, \ldots$, be independent stationary sequences with
$\mathrm{E}\xi_i(t) = 0$ and the covariance functions $R_k^{(i)}$, $i = 1, 2$. We assume that
$\eta_t = \xi_{\nu_t}(t)$. Is the sequence $\{\eta_t\}$ stationary? Find $\mathrm{E}\eta_t$ and $R_k$.

Hint. Use the formula for a total expectation.

6.14. Simulate a sequence $\eta_1, \ldots, \eta_{100}$, where $\eta_t$ is defined as in
Problem 6.13, $\alpha = 1/3$, and $\xi_i(t)$ are uniformly distributed on the segment
$[-1, 1]$. Calculate the estimates for $\mathrm{E}\eta_t$ and $R_k$.



.-[-! 
6.15. Solve Problem 6.14 for $\mathcal{L}(\xi_i(t)) = \mathcal{N}(0, 1)$.

6.16. Show that under the condition (6.1) the spectral density $f(x)$
(see (6.3)) exists, is continuous, and defines the covariance function
according to the formula

$$R_k = \int_{-\pi}^{\pi} f(x) \cos kx\,dx.$$

6.17. Calculate the spectral density for a stationary sequence of
uncorrelated random variables.

6.18. Does the spectral density of the sequence $\{X_t\}$ in Problem 6.4
exist? Show that in this case $R_k = \int_{-\pi}^{\pi} \cos kx\,dF(x)$, where $F(x)$
is a step function with steps $\sigma^2/2$ at the points $\pm\lambda$, $F(-\pi) = 0$, $F(\pi) = \sigma^2$.
($F(x)$ is called the spectral function of the sequence $\{X_t\}$.)

6.19. Construct the following representation of the periodogram
(see (6.4)) using the sample values of the sequence $\{X_t\}$ (the mean
$m$ is known):

$$f_n(x) = \frac{1}{2\pi n} R_n^2(x), \qquad R_n^2(x) = A_n^2(x) + B_n^2(x),$$

where

$$A_n(x) = \sum_{t=1}^{n} (X_t - m) \cos tx, \qquad B_n(x) = \sum_{t=1}^{n} (X_t - m) \sin tx.$$

6.20*. Show that for the expectation of a periodogram with the
known mean $m = \mathrm{E}X_t$ the representation

$$\mathrm{E}f_n(x) = \int_{-\pi}^{\pi} k_n(x - y) f(y)\,dy$$

is true, where $k_n(x) = \frac{1}{2\pi n} \left( \frac{\sin(nx/2)}{\sin(x/2)} \right)^2$ is Fejér's kernel.
Prove that $\mathrm{E}f_n(x) \to f(x)$, $-\pi \le x \le \pi$, as $n \to \infty$.

Remark. A periodogram is an asymptotically unbiased estimator for
the spectral density even if $m$ is unknown, but the estimates are not
consistent. Consistent estimates can be obtained if the weighting
coefficients $w_n(k)$ in (6.4) are chosen properly.

6.21*. Let the mean $m = \mathrm{E}X_t$ be known and $w_n(k) = \frac{\sin \varepsilon k}{\varepsilon k}$.
Prove that $f_n(x)$ is an asymptotically unbiased and consistent
estimator for the quantity $\frac{1}{2\varepsilon} \int_{x-\varepsilon}^{x+\varepsilon} f(y)\,dy$, $0 < \varepsilon < \pi$,
$x \in [-\pi + \varepsilon, \pi - \varepsilon]$.

Remark. The result is also true for an unknown $m$.

6.22*. Let $m$ be known, and $w_n(k) = \left(1 - \frac{|k|}{n}\right)\left(1 - \frac{|k|}{l_n}\right)$ for
$|k| \le l_n$ and $w_n(k) = 0$ for $|k| > l_n$. Prove that if $n, l_n \to \infty$, $l_n/n \to 0$,
and $\sum |k R_k| < \infty$, then $f_n(x)$ is an asymptotically unbiased estimator
for $f(x)$.

Remark. The result is also true for an unknown $m$. The estimates
are consistent under wide conditions.
TO CHAPTER 1

1.1. Take $X_n = 1$ if $U_n \le p$ and $X_n = 0$ if $U_n > p$, where $U_n$ is the sequence (1.5).

1.3. Divide the segment $[0, 1]$ into $N$ parts $\Delta_1, \Delta_2, \ldots, \Delta_N$, where $\Delta_1 =
[0, p_1]$, $\Delta_i = (p_1 + \ldots + p_{i-1},\ p_1 + \ldots + p_i]$, $i = 2, 3, \ldots, N$. Take
$X_k = i$ if $U_k \in \Delta_i$, $i = 1, \ldots, N$, where $U_k$ is the sequence (1.5). The numbers
$X_1, \ldots, X_n$ form the realization of $n$ first trials of the polynomial scheme
with the indicated probabilities of the outcomes.

1.4. Assume that $S_0 = 0$, $S_n = S_{n-1} + X_n$, $n \ge 1$, where $X_n = 1$ if $U_n \le 1/2$
and $X_n = -1$ if $U_n > 1/2$. Here $U_n$ is the sequence (1.5).

1.6. Suppose that $X_n = -a \ln(1 - U_n)$, where $U_n$ is the sequence (1.5).

1.8. Let $\xi_1, \ldots, \xi_m$ be independent random variables having an exponential
distribution with the parameter $a$. Then the random variable $\eta = \xi_1 + \ldots + \xi_m$
has Erlang's distribution with the parameters $(a, m)$.

1.9. Take

$$X_n = \frac{U_{(n-1)N+1} + \ldots + U_{nN} - N/2}{\sqrt{N/12}},$$

where $U_n$ is the sequence (1.5). We can obtain a good approximation to the normal
distribution already for $N = 12$. Take this value for the calculations.
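
The recipes above translate directly into code. The sketch below takes the uniform numbers as arguments (in practice they would come from the sequence (1.5) or from a library generator); the function names are hypothetical, and treating $a$ as the mean of the exponential distribution is an assumption.

```python
import math

def bernoulli(u, p):
    """Problem 1.1: 1 if u <= p, otherwise 0."""
    return 1 if u <= p else 0

def polynomial_outcome(u, probs):
    """Problem 1.3: 1-based index of the interval of [0, 1] that contains u."""
    upper = 0.0
    for i, p in enumerate(probs, start=1):
        upper += p
        if u <= upper:
            return i
    return len(probs)               # guards against rounding in the cumulative sums

def random_walk(us):
    """Problem 1.4: S_0 = 0, S_n = S_{n-1} + X_n with steps +1 or -1."""
    path = [0]
    for u in us:
        path.append(path[-1] + (1 if u <= 0.5 else -1))
    return path

def exponential(u, a):
    """Inverse-transform sample -a * ln(1 - u)."""
    return -a * math.log(1.0 - u)

def erlang(us, a):
    """Problem 1.8: sum of len(us) independent exponential samples."""
    return sum(exponential(u, a) for u in us)

def approx_normal(us):
    """Problem 1.9: standardized sum of N = len(us) uniform numbers."""
    n = len(us)
    return (sum(us) - n / 2.0) / math.sqrt(n / 12.0)
```

For example, `approx_normal([random.random() for _ in range(12)])` gives one approximately standard normal value, since for $N = 12$ the denominator equals 1.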

1.12. If $\xi_1, \ldots, \xi_k$ are independent Bernoulli random variables with the
parameter $p$ (see Problem 1.1), then $\mathcal{L}(\xi_1 + \ldots + \xi_k) = Bi(k, p)$.

1.13. By the De Moivre-Laplace theorem we have for large $n$

$$P\left\{ \left| \frac{\nu_n - np}{\sqrt{npq}} \right| \le t \right\} \approx \Phi(t) - \Phi(-t) = 2\Phi(t) - 1, \qquad q = 1 - p,$$

or

$$P\left( \left| \frac{\nu_n}{n} - p \right| \le t\sqrt{\frac{pq}{n}} \right) \approx 2\Phi(t) - 1.$$

Consequently, we must take $\delta_\gamma = t\sqrt{pq/n}$ and $t = \Phi^{-1}\left(\frac{1 + \gamma}{2}\right) = u_{(1+\gamma)/2}$ for
the relation $P\left( \left| \frac{\nu_n}{n} - p \right| \le \delta_\gamma \right) \approx \gamma$ to be true. For $\gamma = 0.98$ the quantile is
$u_{0.99} = 2.326$. For the given experimental data the boundary is $\delta_{0.98} = 0.0183$, and
$\left| \frac{\nu_n}{n} - \frac{1}{2} \right| = 0.0069$. This is in good agreement with the theory.

1.14. According to our assumption, the appearance of a number no greater
than 4 can be considered to be a success in the Bernoulli trials with the probability
$p = 1/2$ of success. Therefore (see the solution to Problem 1.13), the boundary is
$\delta_{0.98} = 0.0116$, and the observed deviation of the success frequency is
$\left| \frac{\nu_n}{n} - \frac{1}{2} \right| = 0.0089 < \delta_{0.98}$. Consequently, we have a good agreement between the experimental
data and the theory.

1.16. For large $n$ we have

$$P(|\bar{X} - \alpha_1| \le \delta) \approx 2\Phi\left( \delta\sqrt{n/\mu_2} \right) - 1 = 2\Phi(t) - 1.$$

In this case $\alpha_1 = 6$, $\mu_2 = 3$, $n = 4096$, and the right-hand side of the approximate
equation is equal to 0.998 for $\delta = \sqrt{\mu_2/n}\,u_{0.999} = \sqrt{3/4096} \times 3.090 = 0.0836$.
The observed value $|\bar{x} - 6| = 0.1389$ is much greater than the boundary, i.e., we
have observed a hardly probable event.

1.17. In this case (see the previous solution) $\alpha_1 = 2$, $\mu_2 = 5/6$, $\delta =
\sqrt{5/(6 \times 4096)} \times 3.090 = 0.044$. The observed value $|\bar{x} - 2| = 0.003$ lies within
these boundaries, i.e., this characteristic gives a better agreement of the data with
the theory.

1.19. Using the results of Problems 1.13 and 1.16, we obtain $\delta =
\sqrt{\mu_2/n}\,u_{0.99} = \sqrt{11.9167/500} \times 2.326 = 0.359$. The observed deviation is
$|\bar{x} - \alpha_1| = |5.942 - 6| = 0.058$, i.e., we have a good agreement of the data
with the theory.
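
The boundaries used in these solutions are easy to recompute. The sketch below uses the normal quantile function from Python's standard library; the helper name `boundary` is hypothetical.

```python
import math
from statistics import NormalDist

def boundary(variance, n, gamma):
    """Half-width delta with P(|sample mean - true mean| <= delta) close to gamma."""
    t = NormalDist().inv_cdf((1.0 + gamma) / 2.0)   # quantile of order (1 + gamma)/2
    return t * math.sqrt(variance / n)

print(boundary(3.0, 4096, 0.998))         # setting of Problem 1.16
print(boundary(5.0 / 6.0, 4096, 0.998))   # setting of Problem 1.17
```

The two printed values are approximately 0.0836 and 0.044, the boundaries quoted in the solutions to Problems 1.16 and 1.17.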

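1.24. Since $F_n(x_0) = \mu_n(x_0)/n$ and $\mathcal{L}(\mu_n(x_0)) = Bi(n, p_0)$, where $p_0 = F(x_0)$,
using the De Moivre-Laplace theorem for $n \to \infty$, we have

$$P\left( \frac{\mu_n(x_0) - np_0}{\sqrt{np_0q_0}} \le t \right) \to \Phi(t), \qquad q_0 = 1 - p_0.$$

Whence

$$P\left( |F_n(x_0) - p_0| \le \frac{t}{\sqrt{n}} \right) \to \Phi\left( \frac{t}{\sqrt{p_0q_0}} \right) - \Phi\left( -\frac{t}{\sqrt{p_0q_0}} \right) = 2\Phi\left( \frac{t}{\sqrt{p_0q_0}} \right) - 1.$$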
1.25. We have

$$\operatorname{cov}(F_n(x_1), F_n(x_2)) = \frac{1}{n^2} \operatorname{cov}(\mu_n(x_1),\ \Delta_n(x_1, x_2) + \mu_n(x_1)) = \frac{1}{n^2} \left[ \operatorname{cov}(\mu_n(x_1), \Delta_n(x_1, x_2)) + \mathrm{D}\mu_n(x_1) \right].$$

Here $\mathrm{D}\mu_n(x_1) = nF(x_1)(1 - F(x_1))$ (see the solution to Problem 1.24), and

$$\operatorname{cov}(\mu_n(x_1), \Delta_n(x_1, x_2)) = \sum_{i,j=1}^{n} \operatorname{cov}(\eta_i, \zeta_j).$$

Since the observations are independent, the indicators $\eta_i$ and $\zeta_j$ are independent
for $i \ne j$, and we obtain $\operatorname{cov}(\eta_i, \zeta_j) = 0$. We also get

$$\operatorname{cov}(\eta_i, \zeta_i) = \mathrm{E}\eta_i\zeta_i - \mathrm{E}\eta_i \mathrm{E}\zeta_i = P(\eta_i = \zeta_i = 1) - P(\eta_i = 1)P(\zeta_i = 1) = -P(\eta_i = 1)P(\zeta_i = 1) = -F(x_1)(F(x_2) - F(x_1)),$$

because $\{\eta_i = \zeta_i = 1\}$ is an impossible event. From this we find

$$\operatorname{cov}(\mu_n(x_1), \Delta_n(x_1, x_2)) = -nF(x_1)(F(x_2) - F(x_1)).$$

By uniting these formulas, we get the desired result.

1.26. Consider a complete group of $N$ events $E_1 = \{\xi \le x_1\}$, $E_2 =
\{x_1 < \xi \le x_2\}$, $\ldots$, $E_{N-1} = \{x_{N-2} < \xi \le x_{N-1}\}$, $E_N = \{\xi > x_{N-1}\}$, whose
probabilities are $p_1, \ldots, p_N$, respectively. Then $\nu_i$ is obviously the number of
the realizations of $E_i$ in $n$ independent and uniform trials, $i = 1, \ldots, N$. Consequently,
$\mathcal{L}(\nu) = M(n; p_1, \ldots, p_N)$. We have

$$-np_1p_2 = \operatorname{cov}(\nu_1, \nu_2) = \operatorname{cov}(\mu_n(x_1),\ \mu_n(x_2) - \mu_n(x_1)) = \operatorname{cov}(\mu_n(x_1), \mu_n(x_2)) - \mathrm{D}\mu_n(x_1).$$

Whence

$$\operatorname{cov}(\mu_n(x_1), \mu_n(x_2)) = \mathrm{D}\mu_n(x_1) - np_1p_2 = np_1(1 - p_1 - p_2) = nF(x_1)(1 - F(x_2)),$$

which is equivalent to the result obtained in Problem 1.25.

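1.27. The random variables $X_i^k$, $i = 1, \ldots, n$, are independent and distributed
as $\xi^k$ for any $k$. Therefore,

$$\operatorname{cov}(A_{nk}, A_{ns}) = \frac{1}{n^2} \sum_{i,j=1}^{n} \operatorname{cov}(X_i^k, X_j^s) = \frac{1}{n} \operatorname{cov}(\xi^k, \xi^s) = \frac{1}{n} (\mathrm{E}\xi^{k+s} - \mathrm{E}\xi^k \mathrm{E}\xi^s) = \frac{1}{n} (\alpha_{k+s} - \alpha_k\alpha_s).$$

In particular,

$$\mathrm{D}\bar{X} = \mathrm{D}A_{n1} = \frac{\alpha_2 - \alpha_1^2}{n} = \frac{\mu_2}{n}.$$

When investigating the central sampling moments, we may assume that
$\alpha_1 = 0$ and hence $\mu_k = \alpha_k$. Taking this into account, we have

$$\mathrm{E}S^2 = \mathrm{E}A_{n2} - \mathrm{E}A_{n1}^2 = \mu_2 - \frac{\mu_2}{n} = \frac{n - 1}{n} \mu_2.$$

We know that $(S^2)^2 = A_{n2}^2 - 2A_{n2}A_{n1}^2 + A_{n1}^4$ and directly calculate

$$\mathrm{E}(S^2)^2 = \frac{\mu_4 + (n - 1)\mu_2^2}{n} - \frac{2\,[\mu_4 + (n - 1)\mu_2^2]}{n^2} + \frac{\mu_4 + 3(n - 1)\mu_2^2}{n^3}.$$

Whence

$$\mathrm{D}S^2 = \mathrm{E}(S^2)^2 - (\mathrm{E}S^2)^2 = \frac{(n - 1)^2}{n^3} \mu_4 - \frac{(n - 1)(n - 3)}{n^3} \mu_2^2,$$

which is equivalent to the given formula. We still consider that $\alpha_1 = 0$ and
get $\operatorname{cov}(\bar{X}, S^2) = \mathrm{E}(\bar{X}S^2)$. By writing

$$\bar{X}S^2 = A_{n1}A_{n2} - A_{n1}^3,$$

we obtain

$$\operatorname{cov}(\bar{X}, S^2) = \frac{\mu_3}{n} - \frac{\mu_3}{n^2} = \frac{n - 1}{n^2} \mu_3.$$

For the distribution $\mathcal{N}(\mu, \sigma^2)$ the moments are $\alpha_1 = \mu$, $\mu_2 = \sigma^2$, $\mu_3 = 0$,
$\mu_4 = 3\sigma^4$. Therefore, we have $\mathrm{E}\bar{X} = \mu$, $\mathrm{D}\bar{X} = \sigma^2/n$, $\mathrm{E}S^2 = \frac{n - 1}{n}\sigma^2$,
$\mathrm{D}S^2 = \frac{2(n - 1)}{n^2}\sigma^4$, $\operatorname{cov}(\bar{X}, S^2) = 0$.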
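
The passage from the general variance of $S^2$ to the normal case can be verified with exact arithmetic. The sketch below assumes $S^2 = \frac{1}{n}\sum (X_i - \bar{X})^2$ and the standard expression of its variance through the central moments $\mu_2$ and $\mu_4$; the function name is hypothetical.

```python
from fractions import Fraction

def var_s2(n, mu2, mu4):
    """D S^2 = (n-1)^2/n^3 * mu4 - (n-1)(n-3)/n^3 * mu2^2."""
    n = Fraction(n)
    return (n - 1) ** 2 / n ** 3 * mu4 - (n - 1) * (n - 3) / n ** 3 * mu2 ** 2

# Normal case: mu2 = sigma^2 and mu4 = 3 * sigma^4 should give 2(n-1)/n^2 * sigma^4.
n, sigma2 = 7, Fraction(2)
normal_value = var_s2(n, sigma2, 3 * sigma2 ** 2)
print(normal_value, Fraction(2 * (n - 1), n ** 2) * sigma2 ** 2)
```

Both printed fractions coincide, which confirms the simplification for these particular values of $n$ and $\sigma^2$.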

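1.28. Consider the $r$-dimensional vectors $\zeta_i = (X_i, X_i^2, \ldots, X_i^r)$, $i = 1, \ldots, n$.
They are independent and similarly distributed with $\mathrm{E}(\zeta_i) = \alpha$, $\mathrm{D}(\zeta_i) =
\|\operatorname{cov}(X_i^k, X_i^s)\| = \Sigma$ ($\alpha$ and $\Sigma$ are defined in the statement of the problem).
Using the Central Limit Theorem, we then have for $n \to \infty$

$$\mathcal{L}\left( \frac{1}{\sqrt{n}} (\zeta_1 + \ldots + \zeta_n - n\alpha) \right) \to \mathcal{N}(0, \Sigma).$$

It remains to recall that $\frac{1}{\sqrt{n}} (\zeta_1 + \ldots + \zeta_n - n\alpha) = \sqrt{n}\,(A_{n1} - \alpha_1, \ldots, A_{nr} - \alpha_r)$.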

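1.29. We can consider that $\alpha_1 = \mathrm{E}\xi = 0$ (see the solution to Problem 1.27).
We take $\eta_n = \sqrt{n}(S^2 - \mu_2) = \zeta_n + \delta_n$, where $\zeta_n = \sqrt{n}(A_{n2} - \mu_2)$, $\delta_n =
-\sqrt{n}A_{n1}^2$. Since $\mathcal{L}(\zeta_n) \to \mathcal{N}(0, \mu_4 - \mu_2^2)$ (see the
solution to Problem 1.28), it is sufficient to show that $\delta_n \xrightarrow{P} 0$. But

$$P(|\delta_n| > \varepsilon) \le \frac{1}{\varepsilon} \mathrm{E}|\delta_n| = \frac{\sqrt{n}}{\varepsilon} \mathrm{E}A_{n1}^2 = \frac{\sqrt{n}}{\varepsilon} \cdot \frac{\mu_2}{n} = \frac{\mu_2}{\varepsilon\sqrt{n}} \to 0,$$

which was to be proved. The asymptotics of the moments follows from Problem
1.27.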

1.30. The events $\{X_{(r)} \le x_1, X_{(s)} \le x_2\}$ and $\{\mu_n(x_1) \ge r, \mu_n(x_2) \ge s\}$ are
equivalent. Therefore, we have $F_{rs}(x_1, x_2) = P(\mu_n(x_1) \ge r, \mu_n(x_2) \ge s)$. Suppose
that $x_1 < x_2$. We consider the random variables $\nu_1 = \mu_n(x_1)$, $\nu_2 = \mu_n(x_2) -
\mu_n(x_1)$, $\nu_3 = n - \mu_n(x_2)$. Then (see the solution to Problem 1.26)

$$\mathcal{L}(\nu_1, \nu_2, \nu_3) = M(n; p_1, p_2, p_3),$$

where $p_1 = F(x_1)$, $p_2 = F(x_2) - F(x_1)$, $p_3 = 1 - F(x_2)$. From this we have

$$P(\mu_n(x_1) \ge r, \mu_n(x_2) \ge s) = \sum P(\nu_1 = m, \nu_2 = j),$$

where the summation is performed over all $m$ and $j$ which satisfy the conditions
$m \ge r$, $s \le m + j \le n$. Since

$$P(\nu_1 = m, \nu_2 = j) = \frac{n!}{m!\,j!\,(n - m - j)!}\, p_1^m p_2^j p_3^{n-m-j},$$

we obtain the formula from the statement of the problem. If $x_1 \ge x_2$, we have
$\{X_{(r)} \le x_1, X_{(s)} \le x_2\} = \{X_{(s)} \le x_2\}$. The formula for the univariate distribution
function can be obtained, for example, from $F_r(x_1) = \lim_{x_2 \to \infty} F_{rs}(x_1, x_2)$.

1.31. Suppose that $r = 2$ (the general case is treated in a similar way) and
the points $x_1 < x_2$ are given. The event $\{X_{(k_1)} \in (x_1, x_1 + dx_1),\ X_{(k_2)} \in
(x_2, x_2 + dx_2)\}$ occurs if and only if $k_1 - 1$ of all the observations are smaller
than $x_1$, one observation is in the interval $(x_1, x_1 + dx_1)$, $k_2 - k_1 - 1$ observations
are in the interval $(x_1 + dx_1, x_2)$, one observation is in the interval
$(x_2, x_2 + dx_2)$, and the other $n - k_2$ observations are greater than $x_2 + dx_2$.
Since the observations are independent, the probability of this event for small
$dx_1$ and $dx_2$, up to the terms with a greater order of smallness, is equal to

$$\frac{n!}{(k_1 - 1)!\,(k_2 - k_1 - 1)!\,(n - k_2)!}\, F^{k_1 - 1}(x_1) f(x_1)\,dx_1\, [F(x_2) - F(x_1)]^{k_2 - k_1 - 1} f(x_2)\,dx_2\, [1 - F(x_2)]^{n - k_2}.$$

We divide this by $dx_1 dx_2$ and tend $dx_1$ and $dx_2$ to zero to obtain the desired
formula for $g_{k_1k_2}(x_1, x_2)$.

1 .32. We write ti - Inptl, i a 1,2, and assume that >w = (2" wr - i>,)^. 
■' ■ 1, 2. By (1.2) the joint distribution density of the random variables ij„, 
and it,, is (see Problem l.3l) 

MJ-J, &) = -St. + !.*,♦ I ( J7., + ~, i„ t + ^) = Ai(n)Ai(n)A,{n), 
1 \ Vfl v«/ 

where 

^i(t) 



n'.pl'ipi -PiP' *'"*(! -P3)""* 3 "' 



n*j!(A:i - *i - !)!(« - Ai - ])! 



pi - p i 



<-i)r 



i -« 
From Stirling's formula it follows that A i (n) -* — • Since

2xV/i|(/?i - pi)(l - pii 
the distribution density /(jr) is continuous, we have Ai(ri) — /(i>,)/(J>t)- fi- 
nally, since 



we easily get 

In AdD - ~ (—r^ — zftM* - -^-AMflW 

2 \Pi(W - J»i) At - Pi 

(1 - Pi)(/Jj -Pi) / 2 J *- J 

taking into account that /({■p,)/<i> t Wpi(/Ja - Pi)(l - Pi) - (del |oy|)~ 1 '' 2 , 
we finally write 



2Wdet \aij\ 



fc £•*.}• 



i.e., in the limit we have the density of a bivariate normal distribution with
zero means and the matrix of the second moments $\|\sigma_{ij}\|$.

1.33. According to the solution of Problem 1.31, the joint distribution den- 
sity of Xf^ and Xt»- t + o is (for x t < x z ) 

©■„»_,«. t<*>, xi) = - — „„ "' -r-'ixoiF&i) - «*»-*"" 

, </■ - l)I(s - l)!(n - s - r)< 

x Jl - F(x 2 )y 7(jc.)/(»). 

Since the Jacobian of the transformation yi = nFOct). yi = n[l - Ffrii] is 
J5£». *i> = ~« 2 /(*i)/(*i)i ov formula (l.r) the joint distribution density of 
the random variables * n = «F(Jf<r)) and ij„ « » [1 - F{X(„ - ,, o)] has the form 

M**l-^.(«^(S).l-'(«-?)) 

i\j(*-(d\ r-Y. ^M. "' *" *" 

/lV W Y "//I («-<--a)W**0—l»fr-l» 

\ « y (/ - 1)! (* - 1>! 

if n — ■ so and r, s are fixed. 
Thus $\xi_n$ and $\eta_n$ are asymptotically independent and so are $X_{(r)}$ and
$X_{(n-s+1)}$. We also have $\mathcal{L}(\xi_n) \to \Gamma(1, r)$ and $\mathcal{L}(\eta_n) \to \Gamma(1, s)$.

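1.34. The Jacobian of the transformation $y_1 = nx_1$, $y_2 = (n - 1)(x_2 - x_1)$,
$\ldots$, $y_n = x_n - x_{n-1}$ is $J(x_1, \ldots, x_n) = n!$. Whence, by using (1.2)
and taking into account the hint, we find that the joint distribution density of
$Y_1, \ldots, Y_n$ is $\exp\{-y_1 - \ldots - y_n\}$. Since $X_{(k)} = \sum_{j=1}^{k} Y_j/(n - j + 1)$, we
have

$$\mathrm{E}X_{(k)} = \sum_{j=1}^{k} \frac{\mathrm{E}Y_j}{n - j + 1}, \qquad \mathrm{D}X_{(k)} = \sum_{j=1}^{k} \frac{\mathrm{D}Y_j}{(n - j + 1)^2}.$$

The mean and variance of the exponential distribution $\Gamma(1, 1)$ are equal to
1, and we finally obtain

$$\mathrm{E}X_{(k)} = \sum_{j=n-k+1}^{n} \frac{1}{j}, \qquad \mathrm{D}X_{(k)} = \sum_{j=n-k+1}^{n} \frac{1}{j^2}.$$

In particular, if $n \to \infty$, we have

$$\mathrm{E}X_{(n)} = \sum_{j=1}^{n} \frac{1}{j} = \ln n + c + o(1),$$

where $c = 0.5772\ldots$ is Euler's constant, and

$$\mathrm{D}X_{(n)} = \sum_{j=1}^{n} \frac{1}{j^2} = \frac{\pi^2}{6} + o(1).$$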


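1.35. The result of Problem 1.31 implies that the joint distribution density
of the random variables $X_{(k)}$ and $X_{(l)}$ is

$$g_{kl}(x_1, x_2) = \frac{n!}{(k - 1)!\,(l - k - 1)!\,(n - l)!}\, x_1^{k-1} (x_2 - x_1)^{l-k-1} (1 - x_2)^{n-l}, \qquad 0 \le x_1 < x_2 \le 1.$$

Then (see (1.2)), the joint distribution density for $Y_1 = X_{(k)}$ and $Y_2 =
X_{(l)} - X_{(k)}$ has the form

$$\varphi(y_1, y_2) = \frac{n!}{(k - 1)!\,(l - k - 1)!\,(n - l)!}\, y_1^{k-1} y_2^{l-k-1} (1 - y_1 - y_2)^{n-l}, \qquad y_1, y_2 \ge 0,\ y_1 + y_2 \le 1.$$

In order to find the distribution density for $Y_2$ it is sufficient to evaluate the
integral

$$\int_0^{1 - y_2} \varphi(y_1, y_2)\,dy_1 = \frac{n!}{(l - k - 1)!\,(n - l + k)!}\, y_2^{l-k-1} (1 - y_2)^{n-l+k}, \qquad 0 \le y_2 \le 1.$$

Similarly, the distribution density for $X_{(k)}$ is

$$\frac{n!}{(k - 1)!\,(n - k)!}\, y_1^{k-1} (1 - y_1)^{n-k}, \qquad 0 \le y_1 \le 1.$$

Since the mean and variance of the distribution $B(a, b)$ are $\frac{a}{a + b}$ and
$\frac{ab}{(a + b)^2 (a + b + 1)}$, respectively, we have

$$\mathrm{E}X_{(k)} = \frac{k}{n + 1}, \qquad \mathrm{D}X_{(k)} = \frac{k(n - k + 1)}{(n + 1)^2 (n + 2)},$$

$$\mathrm{E}(X_{(l)} - X_{(k)}) = \frac{l - k}{n + 1}, \qquad \mathrm{D}(X_{(l)} - X_{(k)}) = \frac{(l - k)(n - l + k + 1)}{(n + 1)^2 (n + 2)}.$$

Finally, since

$$\mathrm{D}(X_{(l)} - X_{(k)}) = \mathrm{D}X_{(k)} + \mathrm{D}X_{(l)} - 2 \operatorname{cov}(X_{(k)}, X_{(l)}),$$

we find that

$$\operatorname{cov}(X_{(k)}, X_{(l)}) = \frac{k(n - l + 1)}{(n + 1)^2 (n + 2)}.$$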

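1.36. Noting that $\mathcal{L}\left( \frac{\xi - a}{b - a} \right) = R(0, 1)$, we may use the solution
to the previous problem. In our case $X_{(1)} = (b - a)X'_{(1)} + a$, $X_{(n)} =
(b - a)X'_{(n)} + a$, where $X'_{(1)}$ and $X'_{(n)}$ are the extrema of the sample of
size $n$ from the distribution $R(0, 1)$, whose joint density is

$$g_{1n}(x_1, x_2) = n(n - 1)(x_2 - x_1)^{n-2}, \qquad 0 \le x_1 < x_2 \le 1.$$

1.37. $P(X_{(1)} > x) = P(X_i > x,\ i = 1, \ldots, n) = [1 - F(x)]^n$,
$x \ge a$. From this we have

$$F_{X_{(1)}}(x) = 1 - \exp\left\{ -n \left( \frac{x - a}{b} \right)^{\alpha} \right\}, \qquad x \ge a,$$

and

$$P\left( n^{1/\alpha} (X_{(1)} - a)/b \le t \right) = 1 - e^{-t^{\alpha}}, \qquad t \ge 0.$$

Thus, the random variable $n^{1/\alpha}(X_{(1)} - a)/b$ has the distribution $W(0, \alpha, 1)$
which is independent of the size of the sample. We infer that

$$\mathrm{E}X_{(1)} = a + b\,\Gamma\left( 1 + \frac{1}{\alpha} \right) n^{-1/\alpha},$$

$$\mathrm{D}X_{(1)} = b^2 \left[ \Gamma\left( 1 + \frac{2}{\alpha} \right) - \Gamma^2\left( 1 + \frac{1}{\alpha} \right) \right] n^{-2/\alpha}.$$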

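1.38. The terms in $F_n(x_1, x_2)$ are independent and distributed like the random
variable $\eta = I(\xi_1 \le x_1) I(\xi_2 \le x_2)$. Therefore,

$$\mathrm{E}F_n(x_1, x_2) = \mathrm{E}\eta = P(\eta = 1) = P(\xi_1 \le x_1, \xi_2 \le x_2) = F(x_1, x_2),$$

$$\mathrm{D}F_n(x_1, x_2) = \frac{1}{n} \mathrm{D}\eta = \frac{1}{n} [\mathrm{E}\eta^2 - (\mathrm{E}\eta)^2] = \frac{1}{n} F(x_1, x_2)(1 - F(x_1, x_2)).$$

By Chebyshev's inequality we have

$$P(|F_n(x_1, x_2) - F(x_1, x_2)| > \varepsilon) \le \frac{1}{\varepsilon^2} \mathrm{D}F_n(x_1, x_2) \to 0$$

as $n \to \infty$. We use $X_j = (X_{1j}, \ldots, X_{nj})$, $j = 1, 2$, $\bar{X}_j$, $S_j^2 = S^2(X_j)$ to denote
the respective sample means and variances, and

$$S_{12} = \frac{1}{n} \sum_{i=1}^{n} (X_{i1} - \bar{X}_1)(X_{i2} - \bar{X}_2) = \frac{1}{n} \sum_{i=1}^{n} X_{i1}X_{i2} - \bar{X}_1\bar{X}_2$$

to denote the sample covariance. Then the
statistical analogue for the correlation coefficient $\rho = \operatorname{cov}(\xi_1, \xi_2)/\sqrt{\mathrm{D}\xi_1 \mathrm{D}\xi_2}$
will be $\rho_n = S_{12}/(S_1S_2)$. If $\mathrm{E}(\xi_1^2\xi_2^2) < \infty$, then there exists
$\mathrm{D}\left( \frac{1}{n} \sum_{i=1}^{n} X_{i1}X_{i2} \right) = \frac{1}{n} \mathrm{D}(\xi_1\xi_2)$ and it follows from Chebyshev's inequality that

$$\frac{1}{n} \sum_{i=1}^{n} X_{i1}X_{i2} \xrightarrow{P} \mathrm{E}(\xi_1\xi_2)$$

for $n \to \infty$. Since $\bar{X}_j \xrightarrow{P} \mathrm{E}\xi_j$, $S^2(X_j) \xrightarrow{P} \mathrm{D}\xi_j$, $j = 1, 2$, we have $\rho_n \xrightarrow{P} \rho$ if $\mathrm{D}\xi_j > 0$,
$j = 1, 2$.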
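1.39. (1) If $\mathcal{L}(\xi) = \mathcal{N}(\mu, \sigma^2)$, then $\mathrm{E}e^{it\xi} = e^{it\mu - t^2\sigma^2/2}$.

(2) If $\mathcal{L}(\xi) = \Gamma(a, \lambda)$, then $\mathrm{E}e^{it\xi} = \frac{1}{(1 - iat)^{\lambda}}$.

(3) If $\mathcal{L}(\nu_1, \ldots, \nu_N) = M(n; p_1, \ldots, p_N)$, then $\mathrm{E}(x_1^{\nu_1} \cdots x_N^{\nu_N}) =
(x_1p_1 + \ldots + x_Np_N)^n$.

(4) If $\mathcal{L}(\xi) = \Pi(\lambda)$, then $\mathrm{E}x^{\xi} = e^{\lambda(x - 1)}$.

(5) If $\mathcal{L}(\xi) = \overline{Bi}(r, p)$, then $\mathrm{E}x^{\xi} = \frac{(1 - p)^r}{(1 - px)^r}$.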

1.40. Suppose that $U$ is an orthogonal matrix which reduces $\Sigma$ to a diagonal
form, i.e., $U'\Sigma U = D$. We write $B = UD^{1/2}$. Then $\Sigma = BB'$, and
$Y = BX + \mu$, where the components of the vector $X$ are independent and normal
as $\mathcal{N}(0, 1)$. We have $Q = X'B'ABX = X'A_1X$ and, as stated,
$A_1^2 = B'ABB'AB = B'A\Sigma AB = B'AB = A_1$, i.e., the matrix $A_1$ is idempotent. Consequently
(see assertion 2° from Sec. 1.6), $\mathcal{L}(Q) = \chi^2(\operatorname{tr} A_1)$. But
$\operatorname{tr} A_1 = \operatorname{tr}(ABB') = \operatorname{tr}(A\Sigma) = m$, q.e.d.

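1.44. The joint distribution density of $X_1$ and $X_2$ is

$$f(x_1, x_2) = \frac{x_1^{\lambda_1 - 1} x_2^{\lambda_2 - 1}}{a^{\lambda_1 + \lambda_2}\,\Gamma(\lambda_1)\Gamma(\lambda_2)}\, e^{-(x_1 + x_2)/a}, \qquad x_1, x_2 > 0.$$

Consider the transformation $y_1 = x_1 + x_2$, $y_2 = x_1/(x_1 + x_2)$. It uniquely
maps the region $\{x_1, x_2 > 0\}$ onto the region $\{y_1 > 0,\ 0 < y_2 < 1\}$, and its
Jacobian is $J(x_1, x_2) = -1/(x_1 + x_2)$. By formula (1.2) the joint distribution
density of $Y_1$ and $Y_2$ is

$$\varphi(y_1, y_2) = f(y_1y_2,\ y_1(1 - y_2))\,y_1 = \frac{y_1^{\lambda_1 + \lambda_2 - 1} e^{-y_1/a}}{a^{\lambda_1 + \lambda_2}\,\Gamma(\lambda_1 + \lambda_2)} \times \frac{y_2^{\lambda_1 - 1} (1 - y_2)^{\lambda_2 - 1}}{B(\lambda_1, \lambda_2)}.$$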


1.45. The formula for the moments follows from that for the moments
of the distribution $\Gamma(a, \lambda)$ for $a = 2$, $\lambda = n/2$, and the property
$\Gamma(\lambda + 1) = \lambda\Gamma(\lambda)$. Specifically, $\mathrm{E}\chi_n^2 = n$, $\mathrm{D}\chi_n^2 = 2n$. By the reproducibility
property of the gamma distribution we have $\chi_n^2 = \xi_1 + \ldots + \xi_n$, where the
terms are independent and equally distributed as $\Gamma(2, 1/2) = \chi^2(1)$. By the
Central Limit Theorem as $n \to \infty$ the random variable $(\chi_n^2 - n)/\sqrt{2n}$ is asymptotically
normal as $\mathcal{N}(0, 1)$.

1.46. The first assertion directly follows from (1.4). In the second case
we have

$$P(a + \tan \xi \le x) = P(\xi \le \arctan(x - a)) = \frac{1}{\pi} \left( \frac{\pi}{2} + \arctan(x - a) \right).$$

The sought-for distribution density is

$$\frac{1}{\pi} \frac{d}{dx} \arctan(x - a) = \frac{1}{\pi} \cdot \frac{1}{1 + (x - a)^2}.$$
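1.47. Since $\mathrm{E}t_n^{2r} = n^r\,\mathrm{E}\xi^{2r}\,\mathrm{E}(\chi_n^2)^{-r}$, the general formulas for the moments
of the distributions $\mathcal{N}(0, 1)$ and $\Gamma(2, n/2)$ give

$$\mathrm{E}\xi^{2r} = 1 \times 3 \times \ldots \times (2r - 1), \qquad \mathrm{E}(\chi_n^2)^{-r} = \frac{1}{(n - 2)(n - 4) \cdots (n - 2r)}$$

for $2r < n$. The remaining assertions about the moments follow from the form
of the density $s_n(x)$. The assertion about the convergence of $s_n(x)$ follows from
the relations

$$\frac{\Gamma\left( \frac{n + 1}{2} \right)}{\sqrt{n}\,\Gamma\left( \frac{n}{2} \right)} \to \frac{1}{\sqrt{2}}, \qquad \left( 1 + \frac{x^2}{n} \right)^{-(n+1)/2} \to e^{-x^2/2}, \qquad n \to \infty.$$

Finally, by the law of large numbers we have $\chi_n^2/n \xrightarrow{P} 1$. Then $\sqrt{\chi_n^2/n} \xrightarrow{P} 1$ and,
consequently, $\mathcal{L}(t_n) \to \mathcal{L}(\xi) = \mathcal{N}(0, 1)$ (see the assertions 1° and 2° (c) from
Sec. 1.4).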

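1.48. Take $Y = \chi_{n_1}^2/(\chi_{n_1}^2 + \chi_{n_2}^2)$. Then $F_{n_1,n_2} = \frac{n_2}{n_1} \cdot \frac{\chi_{n_1}^2}{\chi_{n_2}^2} = \frac{n_2}{n_1} \cdot \frac{Y}{1 - Y}$.
But (see Problem 1.44) $\mathcal{L}(Y) = B\left( \frac{n_1}{2}, \frac{n_2}{2} \right)$ and, therefore,

$$P(F_{n_1,n_2} \le x) = P\left( Y \le \frac{n_1x}{n_2 + n_1x} \right) = B\left( \frac{n_1x}{n_2 + n_1x};\ \frac{n_1}{2}, \frac{n_2}{2} \right).$$

Since $Y = \ldots$, we obtain the relation $\ldots$ The moments can be calculated from

$$\mathrm{E}F_{n_1,n_2}^r = \left( \frac{n_2}{n_1} \right)^r \mathrm{E}(\chi_{n_1}^2)^r\,\mathrm{E}(\chi_{n_2}^2)^{-r}$$

using the formula for the moments of the gamma
distribution. The moments only exist for $-\frac{n_1}{2} < r < \frac{n_2}{2}$ and are equal to

$$\left( \frac{n_2}{n_1} \right)^r \frac{\Gamma\left( \frac{n_1}{2} + r \right) \Gamma\left( \frac{n_2}{2} - r \right)}{\Gamma\left( \frac{n_1}{2} \right) \Gamma\left( \frac{n_2}{2} \right)}.$$

Specifically, for $n_2 > 2$ we have $\mathrm{E}F_{n_1,n_2} = \frac{n_2}{n_2 - 2}$, and for $n_2 > 4$ we have

$$\mathrm{D}F_{n_1,n_2} = \frac{2n_2^2 (n_1 + n_2 - 2)}{n_1 (n_2 - 2)^2 (n_2 - 4)}.$$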

1.49. By the theorem about the mean we have 

For o -» oo and fixed a, Stirling's formula gives — "* - *° ■ Consequently, 

I In [1 - B(*; a, it)] = In (1 - x) - - In b 
b a 

I , r(a + ft) 1, c° _1 . „ . 
+ - In — - + - In ■ - ■ -» In (1 - *). 

b r(ft) t rt» 

The second relation follows from the first one and Problem 1.48.

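1.50. The distribution of the random variable $t_n = \xi/\sqrt{\chi_n^2/n}$ is symmetric
(because the distributions of $-\xi$ and $\xi$ coincide), therefore,

$$P(t_n^2 \le x^2) = P(-|x| \le t_n \le |x|) = 2P(t_n \le |x|) - 1$$

or $P(t_n \le |x|) = \frac{1}{2} + \frac{1}{2} P(t_n^2 \le x^2)$. For $x > 0$ we get $\frac{d}{dx} P(t_n \le x) =
\frac{1}{2} \frac{d}{dx} P(t_n^2 \le x^2)$, i.e., $s_n(x) = x f(x^2; 1, n)$, where $f(x; 1, n)$ is the density of Snedecor's law $S(1, n)$. These relations also imply that
$P(t_n > t/\sqrt{n}) = [1 - F(t^2/n; 1, n)]/2$. This and Problem 1.49 give the required
limiting relation.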



X, + M )j = 



f-(X l + ... + X,)\ = X*(2/). -*"(! «< 



1.SI. Since ^1 - (J^l + - . . + X,) 1 = x*(2/). -^I -t»»i + 



X J (2m), and the random variables are independent, the required 

assertion' follows from the definition of Snedecor's law. 

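1.52. We write $\varphi(x_1, \ldots, x_N) = \mathrm{E}(x_1^{\nu_1} \cdots x_N^{\nu_N}) = (x_1p_1 + \ldots + x_Np_N)^n$.
Then

$$\mathrm{E}(x_1^{\nu_1} \cdots x_k^{\nu_k}) = \varphi(x_1, \ldots, x_k, 1, \ldots, 1) = (x_1p_1 + \ldots + x_kp_k + 1 - p_1 - \ldots - p_k)^n.$$

We have

$$\mathrm{E}(\nu_1)_{r_1} \cdots (\nu_k)_{r_k} = \frac{\partial^{r_1 + \ldots + r_k}}{\partial x_1^{r_1} \cdots \partial x_k^{r_k}}\, \varphi(x_1, \ldots, x_N) \Big|_{x_1 = \ldots = x_N = 1}.$$

The direct calculation of the derivative leads to the sought-for formula. Finally,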

ev ■» 2 ^ n P' = nf/ ' 
'•J V I- I J,/- I /

U3. Wc write i»* = (c,, . , . , i»i), where *>/ = (vj — npj)/iftt, j = 
1. . .., *. It is sufficient to show that the characteristic function is 

Ee» ''*-exp j -il'Ettj, t = (f,, .,., fc), 

for any fixed t. It follows from the previous problem that 

Be*''" =e-«5'-»J] + £ «(e»/^ - 1> 1" , p = (p, j»). 

We take the logarithm of this expression and recall that In <1 + e) = e - 
— + 0{e 3 ), e — 0. Under the conditions of the problem we get 



In 



Ee"'" - -ivnl'p + n / ^;(e"/^ - I) 



— I'Etl + 

2 



q.e.d. 

Finally, wc get after some algebra 

|E*j = pi , - .j»(l - py - . , . - p k ) ?t for k <N. 

1.54. For any integer non-negative $k_1, \ldots, k_N$ such that $k_1 + \ldots +
k_N = n$, we get

$$P(\xi_1 = k_1, \ldots, \xi_N = k_N \mid \xi_1 + \ldots + \xi_N = n) = \frac{P(\xi_1 = k_1, \ldots, \xi_N = k_N)}{P(\xi_1 + \ldots + \xi_N = n)}.$$

Since $\xi_1, \ldots, \xi_N$ are independent, the numerator is equal to

$$\prod_{j=1}^{N} e^{-\lambda_j} \frac{\lambda_j^{k_j}}{k_j!} = e^{-\lambda} \prod_{j=1}^{N} \frac{\lambda_j^{k_j}}{k_j!}, \qquad \lambda = \lambda_1 + \ldots + \lambda_N.$$

We know (see Problem 1.39 (4)) that $\mathcal{L}(\xi_1 + \ldots + \xi_N) = \Pi(\lambda)$, and the
denominator is $e^{-\lambda}\lambda^n/n!$. The sought-for probability is $\frac{n!}{k_1! \cdots k_N!}\, p_1^{k_1} \cdots p_N^{k_N}$
with $p_j = \lambda_j/\lambda$, which proves the assertion.

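1.55. Compute the unconditional probabilities $P(\xi = k)$, $k = 0, 1, \ldots$.
We have

$$P(\xi = k) = \int_0^{\infty} e^{-\lambda} \frac{\lambda^k}{k!} \cdot \frac{\lambda^{r-1} e^{-\lambda/a}}{\Gamma(r)\,a^r}\,d\lambda = \frac{\Gamma(k + r)}{k!\,\Gamma(r)} \left( \frac{a}{a + 1} \right)^k \frac{1}{(a + 1)^r} = C_{r+k-1}^{k} \left( \frac{a}{a + 1} \right)^k \frac{1}{(a + 1)^r}.$$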

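1.56. The vector $(\bar{X}, X_1 - \bar{X}, \ldots, X_n - \bar{X})$ is distributed normally
because it is a linear transformation of the normal vector $X$. Here

$$\operatorname{cov}(\bar{X}, X_i - \bar{X}) = \operatorname{cov}(\bar{X}, X_i) - \mathrm{D}\bar{X} = \frac{1}{n} \mathrm{D}X_i - \mathrm{D}\bar{X} = \frac{\sigma^2}{n} - \frac{\sigma^2}{n} = 0, \qquad i = 1, \ldots, n.$$

Therefore, the first component is independent of all the others.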

1.57. Suppose that $U$ is an orthogonal matrix which reduces $A_1$ to a diagonal
form, i.e., $U'A_1U = D_1$, where, as stated, $n - n_1 = n_2$ diagonal elements
of $D_1$ are zero. We introduce the vector $\eta = U'X$. Then $X = U\eta$ and
we may write

$$Q = X'X = \eta'U'A_1U\eta + \eta'U'A_2U\eta = \eta'D_1\eta + \eta'D_2\eta,$$

where $D_2 = U'A_2U = E_n - D_1$ is a diagonal matrix and $\operatorname{rank} D_2 =
\operatorname{rank} A_2 = n_2$. It follows that the diagonal elements of $D_2$ which correspond
to the zero elements of $D_1$ are equal to unity and all the other elements are
zero. This means that all the non-zero elements of $D_1$ are equal to unity. Consequently,
the matrices $A_1$ and $A_2$ are idempotent. It follows from the above
that $D_1D_2 = O$ and hence $A_1A_2 = O$.

1.58. If we go to the normalized quantities $X_i' = \frac{X_i - \mu}{\sigma}$, the form of the statistic
does not change, and we may consider that $(\mu, \sigma) = (0, 1)$. Let $B$ be an $n \times n$
matrix whose all elements are $1/n$. Then $nS^2 = \sum_{i=1}^{n} (X_i - \bar{X})^2 = X'AX$,
where the matrix $A = E_n - B$ is idempotent, and hence $\operatorname{rank} A =
\operatorname{tr} A = n - 1$. It follows that $n - 1$ of the eigenvalues of $A$ are unities and
one eigenvalue is zero. Suppose that $u_1, \ldots, u_{n-1}$ are the eigenvectors of
$A$, which correspond to the eigenvalue 1. Then $A = \sum_{k=1}^{n-1} u_ku_k'$ is a spectral

representation of A, and we can directly cheek that u, = / — - I ^LZ_1 

V tt - ' V " 

1 A / n __ »-i 

--. - ■ ■> -- I . /~^T <•*"' ~ *i = "i' x . and /is 1 = 2 («"*X)*- Thus, 

by writing Yt =■ u,;X. * = I ft - 1, we obtain 



V* - IS VK? + ... + K, 1 ., 

where r, K. - i are indep endent and ., t (0, 1)- normal, which can be writ- 
ten as i, - Y,/y/Y} + x i~2- Here vi_ 3 is independent of Y, and 
^"<X»-j> = x l (n - 2). Now, by direct calculations, we can find the j?- 
distribution. Note that this distribution is in the interval (- 1, 1) (since tj 1 < J> 
and is symmetric (since the distributions of — Yt and Y t coincide). For 
< u < 1 we will therefore have 

P ( , > „) = iptf > S) = l r ( **-« < *~**^ 

2 2 V(„_ 2 )r2 (/i-2)«V 

2 \(n-2)« J / 

where /^(jt; n s , bj) is the distribution function of Snedeeor's law S(ni, «i). Us- 
ing the rcsuit of Problem 1.48, we write 

\(« - 2)u I / V 2 V 

Wfe finally find for < u < 1 that 

For the negative values of « we have F,(u) = 1 - F,( -_u). 

1.59. (a) The population of random variables $(\bar{X}_1, \bar{X}_2, X_{i1} - \bar{X}_1,
X_{i2} - \bar{X}_2,\ i = 1, \ldots, n)$ is distributed normally because it is a linear transformation
of the normal vector $(X_{i1}, X_{i2},\ i = 1, \ldots, n)$. We can directly check
that the first two components $\bar{X}_1$ and $\bar{X}_2$ are uncorrelated with the others. This
means that $(\bar{X}_1, \bar{X}_2)$ and $(X_{i1} - \bar{X}_1, X_{i2} - \bar{X}_2,\ i = 1, \ldots, n)$ are independent
and hence $(\bar{X}_1, \bar{X}_2)$ and $(S_1^2, S_{12}, S_2^2)$ are also independent.

(b) It follows from the above that $\mathcal{L}(\bar{X}_1, \bar{X}_2) = \mathcal{N}\left( (\mu_1, \mu_2), \frac{1}{n}\Sigma \right)$. By
assuming in Problem 1.40 that $A = n\Sigma^{-1}$, we will obtain the required result.
(c) We have S 2 = - (Y 2 + r,), S,i = — Y.-/Y,, Sl = — Y lt and the
n n n 

absolute value of the transformation's Jacobian is 

\J] — ■ =- 

By (1.2) the density of the joint distribution of (Vi, Y 2 , Y,) is 

«*7i. «. J*> - ~ I*" 11 ** «~** 2 • ~ 2 /WKT<ii - 2), 
V2ir 

i.c, it is the product of the densities of the distributions . 7(0, I), x*i" - 2). 
and x\n - 1) (the coefficient is reduced to the required form by the formula 

T{p)r(p + — J I 2 "' ' = Vrr(2/>n . Thus, the random variable* Yi, Yi, and 

n are independent and S\Yi) * -■/'(0, I), -/tVi) - xV - 2), ^fj) = 
K a (n - 1). Whence 

^(T = YiMYi/{n - 2» - S(» - 2). 

Since e* = — . by using (1.3), we finally find that the distribution 

y/n - 2 + T l 
density of the random variable q* is 



M 



-1 < y < 



1.60. It follows from the solution of Problem 1.29 that the asymptotic
distributions of the statistics $S^2$ and $A_{n2}$ are similar, and hence (see Problem
1.28) the joint distribution of $\bar{X}$ and $S^2$ is asymptotically normal. The distribution
of any of their linear combinations is also asymptotically normal, and
therefore it is sufficient to compute the mean and variance of the difference
$\bar{X} - S^2$. Using the solution of Problem 1.27, we find



'('-A') 
i n\ n - I / /r /i-l

whence we conclude that ^(r*) - .-f(0. I). But 7% - J-„X/\ and. taking into 
account the hint, we arrive at the required result.

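1.62. Here $\mathcal{L}(\xi_1) = \mathcal{N}(\mu_1, \sigma_1^2)$ and $f_{\xi_1}(x) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left\{-\frac{(x - \mu_1)^2}{2\sigma_1^2}\right\}$.

We have

$$f_{\xi_1\xi_2}(x, y) = \frac{1}{2\pi\sigma_1\sigma_2\sqrt{1 - \rho^2}}\exp\left\{-\frac{1}{2(1 - \rho^2)}\left[\frac{(x - \mu_1)^2}{\sigma_1^2} - 2\rho\,\frac{(x - \mu_1)(y - \mu_2)}{\sigma_1\sigma_2} + \frac{(y - \mu_2)^2}{\sigma_2^2}\right]\right\}.$$

Simple calculations lead to the required expression for the conditional density
$f_{\xi_2|\xi_1}(y|x)$.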

1E"> I) 
_ a) fl be the matrix of the transformation. Then 

_*-<¥»". Y«>> = ^-/l f * J . LEL' J and direct calculations give 

— |Sui»|££Hr-.j'|.-|?':| 

(in particular, this leads to the equation $|\Sigma| = |\Sigma_{11}| \times |B|$). The structure of
the matrix $L\Sigma L'$ implies that $Y^{(1)}$ and $Y^{(2)}$ are uncorrelated and, consequently,
independent. Besides we find that $\mathcal{L}(Y^{(2)}) = \mathcal{N}(\mu^{(2)} - A\mu^{(1)}, B)$. Then

_^ {x <i>| X (.i = „«) = _ A y a) + Ax < t>) „ ^ iM{x w h B) 
1.64. By the formula for the total probability, the probability of the event is 

P« - *> = 1 P' ( - J>j In £/, ^ A p(-in U t + , > X - at 






t*~ l e-' \* 



(* " l>! itt 
1.65. Since p(cl/i £ /<*)> = fixyc and



I 



c(b - a) 



the distribution density of the random variable $\xi$ at the point $x \in [a, b]$ is calculated
by the formula

2.1. The problems are solved by calculating the means and variances of the indicated statistics and require the asymptotic analysis of these characteristics as $n \to \infty$. For example, in problem (a) we have $F_n(x) = \mu_n(x)/n$, where $\mu_n(x)$ is the number of units in the sample $X = (X_1, \ldots, X_n)$ which assumed the values $\le x$, i.e., $\mathcal{L}(\mu_n(x)) = Bi(n, F(x))$, whence we have for $n \to \infty$

$$\mathbf{E}F_n(x) = \mathbf{E}\mu_n(x)/n = F(x),$$
$$\mathbf{D}F_n(x) = \mathbf{D}\mu_n(x)/n^2 = F(x)(1 - F(x))/n \to 0,$$

which means that $F_n(x)$ is an unbiased and consistent estimator for $F(x)$.
Consider problem (d). Since (see Problem 1.27)

$$\mathbf{E}S^2 = \frac{n-1}{n}\mu_2, \qquad \mathbf{D}S^2 = O\left(\frac{1}{n}\right) \quad \text{for } \mu_4 < \infty,$$

$S^2$ is a consistent and biased estimator for $\mu_2$. To eliminate the bias, we must use the statistic $\frac{n}{n-1}S^2 = S'^2$ whose variance is $\left(\frac{n}{n-1}\right)^2\mathbf{D}S^2 = O\left(\frac{1}{n}\right)$, i.e., $S'^2$ is also a consistent estimator for $\mu_2$.
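
The bias factor $(n-1)/n$ of $S^2$ can be checked exactly on a small example. The following is only an illustrative sketch: the three-point distribution and the sample size are hypothetical choices, not data from the book, and the expectation is obtained by brute-force enumeration of all samples.

```python
from fractions import Fraction
from itertools import product

def exact_mean_of_s2(values, probs, n):
    """Exact E[S^2] (divisor n) for an i.i.d. sample of size n from a finite distribution."""
    total = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        xs = [values[i] for i in idx]
        p = Fraction(1)
        for i in idx:
            p *= probs[i]                      # probability of this particular sample
        mean = Fraction(sum(xs), n)
        s2 = sum((x - mean) ** 2 for x in xs) / n
        total += p * s2
    return total

# Hypothetical three-point distribution used only for illustration.
values = [0, 1, 3]
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
m1 = sum(p * v for p, v in zip(probs, values))
mu2 = sum(p * (v - m1) ** 2 for p, v in zip(probs, values))   # theoretical variance
n = 4
es2 = exact_mean_of_s2(values, probs, n)
print(es2, Fraction(n - 1, n) * mu2)           # the two numbers coincide
```

Multiplying by $n/(n-1)$ removes the bias, which is the passage from $S^2$ to $S'^2$ described above.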
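2.2. The solution of Problem 2.1 (b) gives

$$T(X) \xrightarrow{P} \sqrt{\alpha_2/2} = \alpha_1 \iff \sqrt{\mu_2} = \alpha_1,$$

i.e., the standard deviation and the mean are equal. Specifically, this is true
for the distributions $\Gamma(a, 1)$ and $\mathcal{N}(a, a^2)$, $a > 0$.

2.4. For the random variable $\eta = \xi_1 + \xi_2$ the sampling data are $(Y_i = X_{i1} + X_{i2},\ i = 1, \ldots, n)$, and its variance is $\mathbf{D}\eta = \mathbf{D}\xi_1 + \mathbf{D}\xi_2 + 2\,\mathrm{cov}(\xi_1, \xi_2)$. Problem 2.1 (d) implies that the statistic

$$\frac{1}{n-1}\sum_{i=1}^n (Y_i - \overline{Y})^2 = \frac{1}{n-1}\sum_{i=1}^n (X_{i1} - \overline{X}_1)^2 + \frac{1}{n-1}\sum_{i=1}^n (X_{i2} - \overline{X}_2)^2 + \frac{2}{n-1}\sum_{i=1}^n (X_{i1} - \overline{X}_1)(X_{i2} - \overline{X}_2)$$

is an unbiased estimator for $\mathbf{D}\eta$. The first two sums of this expansion are unbiased estimators for the variances $\mathbf{D}\xi_1$ and $\mathbf{D}\xi_2$, respectively. Therefore,

$$\mathbf{E}\,\frac{1}{n-1}\sum_{i=1}^n (X_{i1} - \overline{X}_1)(X_{i2} - \overline{X}_2) = \mathrm{cov}(\xi_1, \xi_2),$$

q.e.d.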

2.5. For an arbitrary statistic $T(X)$ we have

$$\mathbf{E}_\theta T(X) = \sum_{x} T(x)\,\theta^{\sum x_i}(1 - \theta)^{n - \sum x_i},$$

where the sum is taken over all $x = (x_1, \ldots, x_n)$ with $x_i = 0, 1$, $i = 1, \ldots, n$.

The right-hand side of this expression is a polynomial in $\theta$ of degree $\le n$. Consequently, the unbiased estimators in this model can only be constructed for the polynomials $\tau(\theta) = \sum_{j=0}^{s} a_j\theta^j$ for $s \le n$.

2.6. Compute the first two moments of the statistic $T$. Since $\mathcal{L}(r_n) = Bi(n, \theta)$, we have $\mathbf{E}_\theta T = \frac{1}{n + \beta}(\mathbf{E}_\theta r_n + \alpha) = \frac{n\theta + \alpha}{n + \beta}$ and

$$\mathbf{E}_\theta T^2 = \frac{1}{(n + \beta)^2}\left(\mathbf{D}_\theta r_n + (\mathbf{E}_\theta r_n)^2 + 2\alpha\mathbf{E}_\theta r_n + \alpha^2\right) = \frac{n(n - 1)\theta^2 + (2\alpha + 1)n\theta + \alpha^2}{(n + \beta)^2}.$$

We find

$$\Delta(\alpha, \beta; \theta) = \mathbf{E}_\theta(T - \theta)^2 = \mathbf{E}_\theta T^2 - 2\theta\mathbf{E}_\theta T + \theta^2 = \frac{\theta^2(\beta^2 - n) + \theta(n - 2\alpha\beta) + \alpha^2}{(n + \beta)^2}.$$

For $\beta = \sqrt{n}$, $\alpha = \sqrt{n}/2$ this error equals $\frac{1}{4(\sqrt{n} + 1)^2}$ and is independent of $\theta$.

Consider the estimator $T' = \frac{r_n + \sqrt{n}/2}{n + \sqrt{n}}$. Its mean square error is smaller than that of the unbiased estimator $T^* = \frac{r_n}{n}$, i.e., $\frac{1}{4(\sqrt{n} + 1)^2} < \frac{\theta(1 - \theta)}{n}$, for

$$\theta \in \left(\frac{1}{2} \pm \frac{\sqrt{2\sqrt{n} + 1}}{2(\sqrt{n} + 1)}\right).$$

The length of this interval tends to zero as $n \to \infty$. For the other values of the parameter $\theta \in (0, 1)$ the estimator $T^*$ is more exact. Thus, the estimators $T'$ and $T^*$ cannot be compared by the mean-square-error test, and we must have other reasons to choose one of them. For example, we may consider that the estimator with a smaller maximum value of the mean square error is better (the minimax principle). Since $\max_\theta \theta(1 - \theta) = \frac{1}{4}$ and $\frac{1}{4(\sqrt{n} + 1)^2} < \frac{1}{4n}$, the minimax principle implies that $T'$ is better than $T^*$.
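
The closed form for the mean square error and the crossover interval can be checked numerically. The sketch below is an illustration only; the function names and the choice $n = 16$ are hypothetical, and the direct sum over the binomial distribution serves as an independent check of the formula.

```python
import math
from math import comb

def mse_formula(alpha, beta, theta, n):
    """Mean square error of T = (r_n + alpha)/(n + beta) from the closed form above."""
    num = theta**2 * (beta**2 - n) + theta * (n - 2 * alpha * beta) + alpha**2
    return num / (n + beta) ** 2

def mse_direct(alpha, beta, theta, n):
    """The same quantity computed by summing over the distribution Bi(n, theta)."""
    return sum(
        comb(n, k) * theta**k * (1 - theta) ** (n - k)
        * ((k + alpha) / (n + beta) - theta) ** 2
        for k in range(n + 1)
    )

def crossover_interval(n):
    """Values of theta for which the constant-risk estimator beats r_n / n."""
    half = math.sqrt(2 * math.sqrt(n) + 1) / (2 * (math.sqrt(n) + 1))
    return 0.5 - half, 0.5 + half

n = 16
alpha, beta = math.sqrt(n) / 2, math.sqrt(n)
for theta in (0.1, 0.5, 0.9):
    print(theta, mse_formula(alpha, beta, theta, n), mse_direct(alpha, beta, theta, n))
print(crossover_interval(n))
```

With $\alpha = \beta = 0$ the same function returns $\theta(1 - \theta)/n$, the error of the unbiased estimator.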

2.7. The unbiasedness condition $\mathbf{E}_\theta T(X) = \tau(\theta)\ \forall\theta \in (0, 1)$ has here the form

$$\sum_{j=0}^{k} T(j)\,C_k^j\,\theta^j(1 - \theta)^{k-j} = \theta^r(1 - \theta)^s \quad \forall\theta \in (0, 1).$$

For any $T$ the left-hand side of this identity is a polynomial in $\theta$ of degree $\le k$. Consequently, the identity is only true for $r + s \le k$. Now we can directly check whether the statistic in the statement of the problem meets the unbiasedness condition and then show that this unbiased estimator is unique. We suppose that $T'(X)$ is another unbiased estimator. Then the statistic $T_1(X) = T(X) - T'(X)$ satisfies the identity

$$\sum_{j=0}^{k} T_1(j)\,C_k^j\,\theta^j(1 - \theta)^{k-j} = 0, \quad 0 < \theta < 1,$$

or $\sum_{j=0}^{k} T_1(j)\,C_k^j\,x^j = 0$, $0 < x < \infty$, where $x = \frac{\theta}{1 - \theta}$. But since the polynomial is identically zero, all its coefficients are zero, i.e., $T'(j) = T(j)$ for $j = 0, 1, \ldots, k$.

2.8. Due to the reproducibility of the binomial distribution we have $\mathcal{L}(T) = Bi(kn, \theta)$ and, therefore,

$$\mathbf{E}_\theta H(T) = \sum_{t=0}^{kn} H(t)\,C_{kn}^t\,\theta^t(1 - \theta)^{kn-t}.$$

For any function $H$ this mean is a polynomial in $\theta$ of degree $\le kn$. Consequently, the unbiased estimators of the form $H(T)$ can only be constructed in this model for the functions of the form $\tau(\theta) = \sum_{j=0}^{s} a_j\theta^j$ for $s \le kn$. Suppose that $\tau_j(\theta) = \theta^j$, $j \le kn$. Since $\mathbf{E}_\theta(T)_j = (kn)_j\theta^j$ (see Problem 1.53), the unbiased estimator for $\tau_j(\theta)$ is the statistic $\tau_j^* = (T)_j/(kn)_j$. Using the technique from the previous problem, we can show that this is a unique unbiased estimator which depends on $T$.
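
The identity $\mathbf{E}_\theta(T)_j/(kn)_j = \theta^j$ can be confirmed with exact rational arithmetic. In this sketch `m` stands for the product $kn$, and the particular values of `m`, `theta` and `j` are hypothetical.

```python
from fractions import Fraction
from math import comb

def falling(x, j):
    """Falling factorial (x)_j = x (x - 1) ... (x - j + 1)."""
    out = 1
    for i in range(j):
        out *= x - i
    return out

def mean_of_estimator(m, theta, j):
    """Exact E[(T)_j / (m)_j] for T distributed as Bi(m, theta)."""
    return sum(
        Fraction(falling(t, j), falling(m, j))
        * comb(m, t) * theta**t * (1 - theta) ** (m - t)
        for t in range(m + 1)
    )

theta = Fraction(2, 7)
print(mean_of_estimator(6, theta, 3), theta**3)   # both values are equal
```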

2.9. The first assertion follows from the chain of equations 

ir -a »«j 

To prove the second assertion, we write the unbiasedness condition
$\mathbf{E}_\theta T(X) = \tau(\theta)\ \forall\theta > 0$ in the form



Sto^-Stt v * >0 - 



It is clear that there is no function $T(k)$ independent of $\theta$ which satisfies this
identity.

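2.10. Here the unbiasedness condition $\mathbf{E}_\theta T(X) = \tau(\theta)\ \forall\theta > 0$ has the form

$$\sum_{k=1}^{\infty} T(k)\frac{\theta^k}{k!} = e^{\theta}(1 - e^{-\theta})^2 = e^{\theta} + e^{-\theta} - 2 = 2\sum_{r=1}^{\infty}\frac{\theta^{2r}}{(2r)!} \quad \forall\theta > 0.$$

The only function $T(k)$ which satisfies this identity has the form

$$T(k) = \begin{cases} 2 & \text{for even } k, \\ 0 & \text{otherwise.} \end{cases}$$

This unbiased estimator for $\tau(\theta)$ is practically useless.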

2.11. Here the unbiasedness condition

$$\mathbf{E}_\theta T(X) = \theta^s \quad \forall\theta \in (0, 1)$$

can be written as

$$\sum_{x=0}^{\infty} T(x)\,C_{r+x-1}^{x}\,\theta^x(1 - \theta)^r = \theta^s \quad \forall\theta \in (0, 1).$$

Since the two power series are identically equal, their respective coefficients are equal, and

$$T(x) = C_{r+x-s-1}^{x-s}\big/C_{r+x-1}^{x}, \quad x = 0, 1, \ldots,$$

is the only function which meets this identity. We conclude that the only unbiased estimator in this problem is the statistic

$$T(X) = (X)_s/(X + r - 1)_s.$$

If $r = 1$, this statistic only assumes the values 0 and 1 which do not belong to the parameter set $\Theta = (0, 1)$ of the model, and therefore it is practically useless.
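
A quick numerical check of this estimator is sketched below. It assumes the probabilities $P(X = x) = C_{r+x-1}^{x}\theta^x(1 - \theta)^r$ used in the identity above; the parameter values and the truncation point of the series are hypothetical.

```python
from fractions import Fraction
from math import comb

def estimator(x, r, s):
    """T(x) = C(r+x-s-1, x-s) / C(r+x-1, x), which equals (x)_s / (x+r-1)_s."""
    if x < s:
        return Fraction(0)
    return Fraction(comb(r + x - s - 1, x - s), comb(r + x - 1, x))

def mean_of_estimator(r, s, theta, terms=400):
    """Truncated series for E T(X); the neglected tail is negligible for moderate theta."""
    return sum(
        float(estimator(x, r, s)) * comb(r + x - 1, x) * theta**x * (1 - theta) ** r
        for x in range(terms)
    )

print(mean_of_estimator(3, 2, 0.3))                    # close to 0.3 ** 2
print(sorted({estimator(x, 1, 1) for x in range(6)}))  # only the values 0 and 1 occur
```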

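2.13. Since $\mathcal{L}(\overline{X}) = \mathcal{N}(\theta, \sigma^2/n)$, we have $\mathbf{E}_\theta(\overline{X}^2) = \mathbf{D}_\theta\overline{X} + (\mathbf{E}_\theta\overline{X})^2 = (\sigma^2/n) + \theta^2$, whence follows the required assertion.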

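2.14. The second and fourth central moments of the distribution $\mathcal{N}(\mu, \theta^2)$ are $\mu_2 = \theta^2$ and $\mu_4 = 3\theta^4$, respectively. Therefore (see the solution to Problem 2.1), we have

$$\mathbf{E}_\theta S^2 = \frac{n-1}{n}\theta^2, \qquad \mathbf{D}_\theta S^2 = \frac{2(n-1)}{n^2}\theta^4.$$

Then

$$\mathbf{E}_\theta(S^2 - \theta^2)^2 = \mathbf{D}_\theta S^2 + (\mathbf{E}_\theta S^2 - \theta^2)^2 = \frac{2n-1}{n^2}\theta^4,$$

$$\mathbf{D}_\theta(S'^2) = \left(\frac{n}{n-1}\right)^2\mathbf{D}_\theta(S^2) = \frac{2}{n-1}\theta^4.$$

Thus,

$$\mathbf{E}_\theta(S^2 - \theta^2)^2 < \mathbf{D}_\theta\tau^* < \mathbf{D}_\theta(S'^2),$$

i.e., by the minimal mean-square-error test the statistic $S^2$ is a more exact estimator for the theoretical variance $\theta^2$ compared to the statistic $\tau^*$, but in the class of unbiased estimators $\tau^*$ is more exact than $S'^2$.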

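2.15. Since $\mathcal{L}\left(\frac{X_1 - \mu}{\theta}\right) = \mathcal{N}(0, 1)$, we have $\mathbf{E}_\theta|X_1 - \mu| = \sqrt{2/\pi}\,\theta$ and

$$\mathbf{D}_\theta|X_1 - \mu| = \mathbf{E}_\theta(X_1 - \mu)^2 - (\mathbf{E}_\theta|X_1 - \mu|)^2 = \frac{\pi - 2}{\pi}\theta^2.$$

Then $\mathbf{E}_\theta T_n(X) = \sqrt{\frac{\pi}{2}}\,\frac{1}{n}\sum_{i=1}^{n}\mathbf{E}_\theta|X_i - \mu| = \theta$ proves the unbiasedness and

$$\mathbf{D}_\theta T_n(X) = \frac{\pi}{2n}\mathbf{D}_\theta|X_1 - \mu| = \frac{\pi - 2}{2n}\theta^2 = O\left(\frac{1}{n}\right)$$

proves the consistency.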

2 « 2ji 

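2.16. The formula $\mathcal{L}(T/\theta^2) = \chi^2(n) = \Gamma(2, n/2)$ and the formula for the moments of the gamma distribution imply that

$$\mathbf{E}_\theta T^{1/2} = \sqrt{2}\,\theta\,\frac{\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)},$$

i.e., the indicated estimator is unbiased.

In order to compare the estimator $\tau_n'$ for $\theta$ with the estimator $T_n$ from the previous problem, we must calculate $\mathbf{D}_\theta\tau_n' = \mathbf{E}_\theta(\tau_n')^2 - \theta^2$. We have

$$\mathbf{E}_\theta T = 2\theta^2\,\frac{\Gamma\left(\frac{n}{2} + 1\right)}{\Gamma\left(\frac{n}{2}\right)} = n\theta^2,$$

taking into account that $\Gamma(x + 1) = x\Gamma(x)$. Consequently,

$$\mathbf{D}_\theta\tau_n' = \left[\frac{n}{2}\left(\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n+1}{2}\right)}\right)^2 - 1\right]\theta^2.$$

This expression must be compared with $\mathbf{D}_\theta T_n = \frac{\pi - 2}{2n}\theta^2$ (see the solution to Problem 2.15). For $n = 1$ both statistics (and their variances) coincide. For $n = 2$ we have $\mathbf{D}_\theta\tau_2' = \left(\frac{4}{\pi} - 1\right)\theta^2 = 0.273\theta^2$ and $\mathbf{D}_\theta T_2 = \frac{\pi - 2}{4}\theta^2 = 0.285\theta^2$, i.e., the estimator $\tau_2'$ is more exact than $T_2$ (here we have taken into account that $\Gamma(1/2) = \sqrt{\pi}$, $\Gamma(1) = 1$). Similarly, for $n = 3$ we have $\mathbf{D}_\theta\tau_3' = \left(\frac{3\pi}{8} - 1\right)\theta^2 = 0.178\theta^2 < \mathbf{D}_\theta T_3 = 0.190\theta^2$.

The result obtained in Problem 2.64 shows that the estimator $\tau_n'$ is more exact than $T_n$ for any $n$.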
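
The two variance factors can be tabulated directly. The following sketch only evaluates the formulas quoted in the solutions to Problems 2.15 and 2.16 (in units of $\theta^2$); the function names are hypothetical.

```python
import math

def var_tau_prime(n):
    """D(tau'_n) / theta^2 = (n/2) * (Gamma(n/2) / Gamma((n+1)/2))^2 - 1."""
    log_ratio = math.lgamma(n / 2) - math.lgamma((n + 1) / 2)
    return (n / 2) * math.exp(2 * log_ratio) - 1

def var_t(n):
    """D(T_n) / theta^2 = (pi - 2) / (2n)."""
    return (math.pi - 2) / (2 * n)

for n in (1, 2, 3, 10, 100):
    print(n, round(var_tau_prime(n), 4), round(var_t(n), 4))
```

For every $n \ge 2$ tried here the first column is smaller than the second, in line with the general statement referred to Problem 2.64.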

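2.18. The mean square error of an arbitrary estimator $T_\lambda = \lambda S'^2$ is

$$\mathbf{E}_\theta(T_\lambda - \theta_2^2)^2 = \mathbf{E}_\theta\left(\lambda(S'^2 - \theta_2^2) + (\lambda - 1)\theta_2^2\right)^2 = \lambda^2\mathbf{D}_\theta(S'^2) + (\lambda - 1)^2\theta_2^4 = \varphi(\lambda)\theta_2^4$$

(see the solution to Problem 2.14, where $\mathbf{D}_\theta(S'^2)$ is calculated), and we have

$$\varphi(\lambda) = \frac{2\lambda^2}{n - 1} + (\lambda - 1)^2.$$

[Fig. 7. The function $\varphi(\lambda)$.]

The function $\varphi(\lambda)$ is plotted in Fig. 7. Since for $\frac{n-3}{n+1} < \lambda < 1$ we have $\varphi(\lambda) < \varphi(1)$, the inequality

$$\mathbf{E}_\theta(T_\lambda - \theta_2^2)^2 < \mathbf{E}_\theta(S'^2 - \theta_2^2)^2$$

holds for these $\lambda$. We define $k$ from the condition $\frac{n-3}{n+1} < \frac{n-1}{n+k} < 1$, which only holds for $k = 0, 1, 2, 3$. Since $\min_\lambda \mathbf{E}_\theta(T_\lambda - \theta_2^2)^2 = \varphi(\lambda^*)\theta_2^4 = \frac{2}{n+1}\theta_2^4$ with $\lambda^* = \frac{n-1}{n+1}$, we conclude that the estimator $T_{\lambda^*} = \frac{1}{n+1}\sum_{i=1}^{n}(X_i - \overline{X})^2$ is optimal with respect to the minimal mean-square-error test.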
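
The statements about $\varphi(\lambda)$ can be verified with exact fractions. This is a sketch under the formula $\varphi(\lambda) = 2\lambda^2/(n-1) + (\lambda - 1)^2$; the helper `admissible_k` and the range of $n$ it scans are hypothetical choices.

```python
from fractions import Fraction

def phi(lam, n):
    """phi(lambda) = 2 lambda^2 / (n - 1) + (lambda - 1)^2."""
    lam = Fraction(lam)
    return 2 * lam**2 / (n - 1) + (lam - 1) ** 2

def admissible_k(n_values, k_values):
    """k for which (n-3)/(n+1) < (n-1)/(n+k) < 1 holds for every n supplied."""
    return [
        k for k in k_values
        if all(Fraction(n - 3, n + 1) < Fraction(n - 1, n + k) < 1 for n in n_values)
    ]

n = 10
print(phi(Fraction(n - 1, n + 1), n))               # minimum value 2/(n+1)
print(phi(Fraction(n - 3, n + 1), n), phi(1, n))    # both equal 2/(n-1)
print(admissible_k(range(2, 200), range(-1, 8)))
```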

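2.19. From the solution to Problem 1.45 we have

$$\mathbf{E}_\theta T_\lambda = \lambda\theta_2^2, \quad \mathbf{E}_\theta T_\lambda^2 = \lambda^2\frac{n+1}{n-1}\theta_2^4, \quad \mathbf{E}_\theta T_\lambda^3 = \lambda^3\frac{(n+1)(n+3)}{(n-1)^2}\theta_2^6, \quad \mathbf{E}_\theta T_\lambda^4 = \lambda^4\frac{(n+1)(n+3)(n+5)}{(n-1)^3}\theta_2^8.$$

We find the measure

$$\delta_1(\theta) = \mathbf{E}_\theta(T_\lambda - \theta_2^2)^4 = \mathbf{E}_\theta(T_\lambda^4 - 4T_\lambda^3\theta_2^2 + 6T_\lambda^2\theta_2^4 - 4T_\lambda\theta_2^6 + \theta_2^8) = \psi(\lambda)\theta_2^8,$$

where

$$\psi(\lambda) = \lambda^4\frac{(n+1)(n+3)(n+5)}{(n-1)^3} - 4\lambda^3\frac{(n+1)(n+3)}{(n-1)^2} + 6\lambda^2\frac{n+1}{n-1} - 4\lambda + 1.$$

Using the formula $\lambda = \frac{n-1}{n+5}(1 + x)$, we reduce the equation $\psi'(\lambda) = 0$ to the form

$$x^3 + 3p_n x + 2q_n = 0, \quad p_n = \frac{2}{n+3}, \quad q_n = -\frac{8}{(n+1)(n+3)}.$$

The discriminant $D_n = p_n^3 + q_n^2$ of this equation is positive, and so the equation has the only real root

$$x_n = \sqrt[3]{-q_n + \sqrt{D_n}} + \sqrt[3]{-q_n - \sqrt{D_n}}$$

(Cardano's formula), which has the asymptotic representation $x_n = \frac{8}{3n} + o\left(\frac{1}{n}\right)$ for large $n$.

We thus find the estimator $T^* = \frac{n-1}{n+5}(1 + x_n)S'^2$ which minimizes the measure $\delta_1(\theta)$ in the class of statistics $\{T_\lambda = \lambda S'^2\}$.

We now turn to the second measure

$$\delta_2(\theta) = \mathbf{E}_\theta|T_\lambda - \theta_2^2| = \chi(\lambda)\theta_2^2,$$

where

$$\chi(\lambda) = \mathbf{E}\left|\frac{\lambda\chi_{n-1}^2}{n-1} - 1\right|, \quad \mathcal{L}(\chi_{n-1}^2) = \chi^2(n - 1).$$

If $k_{n-1}(t)$ is the density of the distribution $\chi^2(n - 1)$, then

$$\chi(\lambda) = \int_0^{(n-1)/\lambda}\left(1 - \frac{\lambda t}{n-1}\right)k_{n-1}(t)\,dt + \int_{(n-1)/\lambda}^{\infty}\left(\frac{\lambda t}{n-1} - 1\right)k_{n-1}(t)\,dt,$$

whence follows that the equation $\chi'(\lambda) = 0$ is equivalent to

$$\int_0^{(n-1)/\lambda} t\,k_{n-1}(t)\,dt = \int_{(n-1)/\lambda}^{\infty} t\,k_{n-1}(t)\,dt,$$

which defines the unique value $\lambda = \lambda_n^*$ and the optimum estimator $T_{\lambda_n^*}$.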
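
The reduction to a depressed cubic for the first measure can be checked numerically. The sketch below assumes the coefficients $p_n = 2/(n+3)$, $q_n = -8/((n+1)(n+3))$ and the substitution $\lambda = \frac{n-1}{n+5}(1 + x)$; the helper names are hypothetical.

```python
import math

def cbrt(v):
    """Real cube root that also works for negative arguments."""
    return math.copysign(abs(v) ** (1 / 3), v)

def optimal_lambda(n):
    """Root x_n of x^3 + 3 p x + 2 q = 0 by Cardano's formula and the matching lambda."""
    p = 2 / (n + 3)
    q = -8 / ((n + 1) * (n + 3))
    d = p**3 + q**2
    x = cbrt(-q + math.sqrt(d)) + cbrt(-q - math.sqrt(d))
    return x, (n - 1) / (n + 5) * (1 + x)

def psi_prime(lam, n):
    """Derivative of psi(lambda), the fourth-moment risk factor."""
    a = (n + 1) * (n + 3) * (n + 5) / (n - 1) ** 3
    b = (n + 1) * (n + 3) / (n - 1) ** 2
    c = (n + 1) / (n - 1)
    return 4 * a * lam**3 - 12 * b * lam**2 + 12 * c * lam - 4

for n in (5, 20, 1000):
    x, lam = optimal_lambda(n)
    print(n, x, lam, psi_prime(lam, n))   # psi_prime is numerically zero
```

For large $n$ the product $n x_n$ approaches $8/3$, which agrees with the asymptotic representation of the root.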
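2.20. Since $\mathcal{L}(nS^2/\theta_2^2) = \chi^2(n - 1) = \Gamma\left(2, \frac{n-1}{2}\right)$, from the formula for the moments of the gamma distribution we have

$$\mathbf{E}_\theta S = \sqrt{\frac{2}{n}}\,\frac{\Gamma\left(\frac{n}{2}\right)}{\Gamma\left(\frac{n-1}{2}\right)}\,\theta_2,$$

which means that the estimator is unbiased. For $n = 2$ we have $\hat\theta_2 = (\sqrt{\pi}/2)|X_1 - X_2|$ and, therefore,

$$\mathbf{E}_\theta\,\frac{\sqrt{\pi}}{2}|X_1 - X_2| = \theta_2,$$

whence

$$\mathbf{E}_\theta|X_1 - X_2| = \frac{2}{\sqrt{\pi}}\,\theta_2.$$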

2.21. The assertion immediately follows from the fact that yWH = 
r(0, Wi and from the formula for the moments of the gamma distribution. 

2.22. If $\mathcal{L}(\xi) = \Gamma(\theta, 1)$, then $\mathcal{L}(\xi/\theta) = \Gamma(1, 1)$ and, according to Problem 1.34, the random variables $Y_r = \frac{n - r + 1}{\theta}(X_{(r)} - X_{(r-1)})$, $r = 1, \ldots, n$, are independent, and we have $\mathcal{L}(Y_r) = \Gamma(1, 1)$ for any $r$. Then (see the solution to Problem 1.34) $X_{(k)} = \theta\sum_{j=1}^{k} Y_j/(n - j + 1)$ and, therefore,

$$T = T(X) = \sum_{k=1}^{r}\lambda_k X_{(k)} = \theta\sum_{i=1}^{r}\frac{\Lambda_i}{n - i + 1}Y_i, \quad \Lambda_i = \sum_{k=i}^{r}\lambda_k, \quad i = 1, \ldots, r.$$

From this representation we immediately obtain

$$\mathbf{E}_\theta T = \theta\sum_{i=1}^{r}\Lambda_i/(n - i + 1), \quad \mathbf{D}_\theta T = \theta^2\sum_{i=1}^{r}\Lambda_i^2/(n - i + 1)^2.$$

The unbiasedness condition is equivalent to $\sum_{i=1}^{r}\Lambda_i/(n - i + 1) = 1$, for which we have to minimize the expression $\sum_{i=1}^{r}\Lambda_i^2/(n - i + 1)^2$. By applying the method of Lagrange multipliers, we find that the optimum choice of $\Lambda_i$ will be $\Lambda_i^* = \frac{n - i + 1}{r}$, $i = 1, \ldots, r$. We conclude that the optimum unbiased estimator for $\theta$ has the form

$$T^* = \frac{1}{r}\sum_{i=1}^{r}(n - i + 1)(X_{(i)} - X_{(i-1)}) = \frac{1}{r}\left[\sum_{i=1}^{r}X_{(i)} + (n - r)X_{(r)}\right]$$

and its variance is $\mathbf{D}_\theta T^* = \theta^2/r$.
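
The optimum weights can be checked with exact arithmetic. The sketch below uses the mean and variance factors from the representation above; the values $n = 10$, $r = 4$ and the competing estimator built from $X_{(r)}$ alone are hypothetical illustrations.

```python
from fractions import Fraction

def cumulative_weights(lams):
    """Lambda_i = lambda_i + ... + lambda_r for the coefficients lambda_1..lambda_r."""
    out, tail = [], Fraction(0)
    for lam in reversed(lams):
        tail += lam
        out.append(tail)
    return out[::-1]

def mean_and_variance_factors(lams, n):
    """Return (E T / theta, D T / theta^2) for T = sum of lambda_k X_(k)."""
    big = cumulative_weights(lams)
    # zero-based index i corresponds to the divisor n - (i + 1) + 1 = n - i
    u = [L / (n - i) for i, L in enumerate(big)]
    return sum(u), sum(v * v for v in u)

n, r = 10, 4
best = [Fraction(1, r)] * (r - 1) + [Fraction(n - r + 1, r)]
print(mean_and_variance_factors(best, n))            # (1, 1/r)

# A competing unbiased estimator that uses X_(r) only.
c = 1 / sum(Fraction(1, n - i) for i in range(r))
other = [Fraction(0)] * (r - 1) + [c]
print(mean_and_variance_factors(other, n))           # mean 1, larger variance
```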

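2.23. It follows from the solution to Problem 1.36 that

$$\mathbf{E}_\theta T = \alpha\mathbf{E}_\theta X_{(n)} + \beta\mathbf{E}_\theta X_{(1)} = \left(\alpha\frac{2n+1}{n+1} + \beta\frac{n+2}{n+1}\right)\theta,$$

$$\mathbf{D}_\theta T = \alpha^2\mathbf{D}_\theta X_{(n)} + \beta^2\mathbf{D}_\theta X_{(1)} + 2\alpha\beta\,\mathrm{cov}(X_{(n)}, X_{(1)}) = \frac{n\theta^2}{(n+1)^2(n+2)}\left(\alpha^2 + \beta^2 + \frac{2\alpha\beta}{n}\right).$$

This implies that the optimum values of $\alpha$ and $\beta$ must minimise the form $\alpha^2 + \beta^2 + 2\alpha\beta/n$ under the condition $\alpha(2n + 1)/(n + 1) + \beta(n + 2)/(n + 1) = 1$. The solution of this extremal problem (for example, using the Lagrange multipliers) has the form $\alpha^* = \frac{2(n+1)}{5n+4}$, $\beta^* = \frac{n+1}{5n+4}$.

Thus, $T^* = \frac{n+1}{5n+4}(X_{(1)} + 2X_{(n)})$ is an optimum unbiased estimator for $\theta$ in the class under consideration, and its variance is $\mathbf{D}_\theta T^* = \frac{\theta^2}{(n+2)(5n+4)}$.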
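
The constrained minimum can be confirmed with exact fractions. This sketch only uses the variance formula and the unbiasedness condition written in the solution; the value $n = 7$ and the size of the perturbation are hypothetical.

```python
from fractions import Fraction

def variance_factor(alpha, beta, n):
    """D T / theta^2 for T = alpha X_(n) + beta X_(1), using the formula in the solution."""
    return Fraction(n, (n + 1) ** 2 * (n + 2)) * (alpha**2 + beta**2 + 2 * alpha * beta / n)

def is_unbiased(alpha, beta, n):
    """Unbiasedness condition alpha (2n + 1) + beta (n + 2) = n + 1."""
    return alpha * (2 * n + 1) + beta * (n + 2) == n + 1

n = 7
beta = Fraction(n + 1, 5 * n + 4)
alpha = 2 * beta
print(is_unbiased(alpha, beta, n), variance_factor(alpha, beta, n))

# Moving along the constraint line in either direction increases the variance.
t = Fraction(1, 100)
for sign in (1, -1):
    a2 = alpha + sign * t * (n + 2)
    b2 = beta - sign * t * (2 * n + 1)
    print(is_unbiased(a2, b2, n), variance_factor(a2, b2, n))
```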
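2.24. The unbiasedness of these estimators directly follows from Problem 1.36. We have $\mathbf{D}_\theta T_1 = \frac{\theta^2}{n(n+2)} < \mathbf{D}_\theta T_2 = \frac{n}{n+2}\theta^2$, and therefore the estimator $T_1$ is more exact. Moreover, $\mathbf{D}_\theta T_1 \to 0$ as $n \to \infty$, i.e., $T_1$ is a consistent estimator, while $T_2$ does not have this property. Indeed, since

$$P_\theta(X_{(1)} \le t) = 1 - \left(1 - \frac{t}{\theta}\right)^n, \quad 0 \le t \le \theta,$$

we have

$$P_\theta(|T_2 - \theta| \le \varepsilon) = P_\theta\left(\frac{\theta - \varepsilon}{n + 1} \le X_{(1)} \le \frac{\theta + \varepsilon}{n + 1}\right) = \left(1 - \frac{\theta - \varepsilon}{\theta(n + 1)}\right)^n - \left(1 - \frac{\theta + \varepsilon}{\theta(n + 1)}\right)^n \to e^{-(\theta - \varepsilon)/\theta} - e^{-(\theta + \varepsilon)/\theta} < 1.$$

2.25. The formulas from Problem 1.36 directly show that the estimators are unbiased and give their variances, viz.,

$$\mathbf{D}_\theta T_1 = \frac{1}{4}\left(\mathbf{D}_\theta X_{(1)} + \mathbf{D}_\theta X_{(n)} + 2\,\mathrm{cov}(X_{(1)}, X_{(n)})\right) = \frac{(\theta_2 - \theta_1)^2}{2(n + 1)(n + 2)},$$

$$\mathbf{D}_\theta T_2 = \left(\frac{n+1}{n-1}\right)^2\left(\mathbf{D}_\theta X_{(1)} + \mathbf{D}_\theta X_{(n)} - 2\,\mathrm{cov}(X_{(1)}, X_{(n)})\right) = \frac{2(\theta_2 - \theta_1)^2}{(n - 1)(n + 2)}.$$

As $n \to \infty$ the variances tend to zero, which means that both estimators are consistent.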

2.26. The results of Problem 1.37 directly show that the estimators are
unbiased. Since $\mathbf{D}_\theta T \to 0$ as $n \to \infty$, they are also consistent.

2.27. In this case the theoretical mean coincides with $\theta$, and the solution
to Problem 2.1 (b) gives the required result.

2.28. By the property of the arithmetic mean of Cauchy's distribution we
have $\mathcal{L}(\overline{X}) = C(\theta)$, i.e., the distribution of the statistic $\overline{X}$ is independent of
$n$, and therefore the quantity $P_\theta(|\overline{X} - \theta| \ge \varepsilon)$ is the same for all $n$.

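2.29. It follows from Problem 1.52 that as $n \to \infty$

$$\mathbf{E}_\theta T_j = p_j, \quad \mathbf{D}_\theta T_j = \frac{1}{n^2}\left[\mathbf{E}_\theta\nu_j(\nu_j - 1) + \mathbf{E}_\theta\nu_j - (\mathbf{E}_\theta\nu_j)^2\right] = \frac{p_j(1 - p_j)}{n} \to 0,$$

and this proves the assertion (a). We then have

$$\mathbf{E}_\theta H(\nu) = \sum_{h_1 + \ldots + h_N = n} H(h)\,\frac{n!}{h_1!\cdots h_N!}\,p_1^{h_1}\cdots p_N^{h_N}.$$

For any function $H$ the right-hand side of this is a polynomial in $p_1, \ldots, p_N$ of degree $\le n$. Consequently, the unbiased estimators can only be constructed for the polynomials of degree $\le n$ in the parameters $p_1, \ldots, p_N$.

Finally, if $H = \frac{1}{n}\sum_{j=1}^{N} c_j\nu_j$, then

$$\mathbf{E}_\theta H = \sum_{j=1}^{N} c_j p_j = \tau(\theta), \quad \mathbf{D}_\theta H = \frac{1}{n}\left(\sum_{j=1}^{N} c_j^2 p_j - \tau^2(\theta)\right) \to 0$$

as $n \to \infty$, i.e., $H$ is an unbiased and consistent estimator for $\tau(\theta)$.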

2.30. We have $\alpha_1 = \alpha_1(\theta) = \theta_1\Gamma(\theta_2 + 1)/\Gamma(\theta_2) = \theta_1\theta_2$, $\alpha_2 = \alpha_2(\theta) = \theta_1^2\Gamma(\theta_2 + 2)/\Gamma(\theta_2) = \theta_1^2\theta_2(\theta_2 + 1)$ and $\theta_1 = (\alpha_2 - \alpha_1^2)/\alpha_1$, $\theta_2 = \alpha_1^2/(\alpha_2 - \alpha_1^2)$.
The sought-for estimates have the form

$$\hat\theta_1 = (A_{n2} - A_{n1}^2)/A_{n1} = S^2/\overline{X},$$

$$\hat\theta_2 = A_{n1}^2/(A_{n2} - A_{n1}^2) = \overline{X}^2/S^2.$$

These statistics are continuous functions of the sampling moments, and therefore they are consistent estimates for the respective parameters.
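
A small simulation illustrates the consistency of these moment estimates. The sketch below is only an illustration: the true parameter values, the seed and the sample size are hypothetical, and the sample is drawn with the standard-library generator `random.gammavariate(shape, scale)`.

```python
import random

def gamma_moment_estimates(xs):
    """Method-of-moments estimates (scale theta_1, shape theta_2) = (S^2/mean, mean^2/S^2)."""
    n = len(xs)
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / n
    return s2 / mean, mean**2 / s2

rng = random.Random(12345)                       # fixed seed for reproducibility
sample = [rng.gammavariate(3.0, 2.0) for _ in range(100_000)]   # shape 3, scale 2
scale_hat, shape_hat = gamma_moment_estimates(sample)
print(scale_hat, shape_hat)                      # close to 2 and 3
```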

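2.31. Here $\alpha_1(\theta) = \mathbf{E}_\theta\xi = \frac{1}{2}(\theta_1 + \theta_2)$, $\alpha_2(\theta) = \mathbf{E}_\theta\xi^2 = \frac{1}{2}(\theta_1^2 + \theta_2^2 + \theta_1 + \theta_2)$, and the estimates

$$\hat\theta_{1,2} = A_{n1} \mp \sqrt{A_{n2} - A_{n1}^2 - A_{n1}} = \overline{X} \mp \sqrt{S^2 - \overline{X}}$$

are the solutions to the equations $\alpha_k(\theta) = A_{nk}$, $k = 1, 2$. For the indicated data we have $\hat\theta_1 = 2.17\ldots$, $\hat\theta_2 = 3.3?$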

2.33. If $D \le k_0$, then the unbiasedness conditions are equivalent to the system

$$\sum_{k=0}^{D} T(k)f(k; D, n) = \tau(D), \quad D = 0, 1, \ldots, k_0.$$

This is a triangular system with the diagonal coefficients $f(D; D, n) = C_{N-D}^{n-D}/C_N^n$, and the values $T(0), T(1), \ldots, T(k_0)$ are uniquely defined from it. For the values $m > k_0$ we stipulate that

$$T(m) = \left[\tau(m) - \sum_{k=0}^{k_0} T(k)f(k; m, n)\right]\Bigg/\left[1 - \sum_{k=0}^{k_0} f(k; m, n)\right],$$

which can be done because for $m > k_0$ we have $1 - \sum_{k=0}^{k_0} f(k; m, n) \ne 0$.
Since

$$\sum_{k} k f(k; D, n) = \frac{nD}{N}$$

(see the formula for the mean of the hypergeometric distribution $H(D, N, n)$ in Chap. 1), for the case of $\tau(D) = D$ the function $T$ has the form $T(k) = kN/n$ for $k = 0, 1, \ldots, k_0$, and for $m > k_0$ we have

$$T(m) = \left[m - \frac{N}{n}\sum_{k=0}^{k_0} k f(k; m, n)\right]\Bigg/\left[1 - \sum_{k=0}^{k_0} f(k; m, n)\right].$$

Specifically, if we put $k_0 = n$ (the inspection of the entire batch is not carried out), then the statistic $T(\xi) = \frac{N}{n}\xi$ is an unbiased estimator for the number $D$ of the defective items.
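
The last statement is easy to confirm exactly. In this sketch the batch size $N = 12$ and the sample size $n = 5$ are hypothetical, and the hypergeometric probabilities are written out directly with binomial coefficients.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, D, N, n):
    """P(k defective items in a sample of n drawn without replacement from N, D defective)."""
    return Fraction(comb(D, k) * comb(N - D, n - k), comb(N, n))

def mean_of_scaled_count(D, N, n):
    """Exact mean of (N/n) * xi, where xi is the number of defective items in the sample."""
    return sum(
        Fraction(N * k, n) * hypergeom_pmf(k, D, N, n)
        for k in range(0, min(D, n) + 1)
    )

N, n = 12, 5
print([mean_of_scaled_count(D, N, n) for D in range(N + 1)])   # equals 0, 1, ..., N
```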

2.34. (a) Since $\mathbf{E}\gamma(u) = P(u \in s) = \pi(u)$ (see the hint), the representation $e(s, x) = \sum_{u}\gamma(u)x(u)/\pi(u)$ implies that the Horvitz-Thompson estimator is unbiased. Since $\sum_{u \in s} a(u)x(u) = \sum_{u}\gamma(u)a(u)x(u)$, the unbiasedness condition for the estimator implies that

$$\sum_{u}\pi(u)a(u)x(u) = \sum_{u}x(u).$$

Specifically, if we take the coordinate vectors of the Euclidean space $R^N$ as $x$, we will find that $\pi(u_i)a(u_i) = 1$, $i = 1, \ldots, N$. Thus, the Horvitz-Thompson estimator is the only linear unbiased estimator for $T(x)$.

(b) Since

$$\mathbf{E}\gamma^2(u) = \mathbf{E}\gamma(u) = \pi(u)$$

and

$$\mathbf{E}\gamma(u)\gamma(v) = P(u \in s, v \in s) = \pi(u, v), \quad u \ne v,$$

we have

$$\mathbf{D}e(s, x) = \mathbf{E}e^2(s, x) - (\mathbf{E}e(s, x))^2 = \sum_{u}\frac{1 - \pi(u)}{\pi(u)}x^2(u) + \sum_{u \ne v}\frac{\pi(u, v) - \pi(u)\pi(v)}{\pi(u)\pi(v)}x(u)x(v),$$

which is equivalent to the required formula.

<c) The unbiasedness follows from the representation 

*— ,yK am \*m } 

YmuW * 1 "^ ( T{ "' * -'V 



+ 



(d) The formulas directly follow from 

2.35. (a) For any fixed unit $u$ there are $n(N - 1)_{n-1}$ different samples which contain this element. Therefore,

$$\pi(u) = \frac{n(N - 1)_{n-1}}{(N)_n} = \frac{n}{N}.$$

(b) The formula for $\mathbf{D}\overline{X}$ follows from the general formula for $\mathbf{D}e(s, x)$ (see Problem 2.34 (b)) if we take into account that

$$\pi(u, v) = \frac{n(n - 1)(N - 2)_{n-2}}{(N)_n} = \frac{n(n - 1)}{N(N - 1)}, \quad u \ne v.$$

(c) The unbiasedness of &(r t x) follows from 



"" n (^"ri) ^jT(*h(»X*M - p)M«) - *<)■ 



2.36. We first calculate En,. We use the him <o obtain E*, = WEf^ = 
JVP<^' ) = 1), where, according to the classical definition of probability, we 
have 



Finally, 



E -~«(s)'H)""' 


-*2«®'(-s)r- 



We find that the mean of any linear statistic has the form 
Ef 



i*., it is a polynomial in I//Vof degree <n - 1. This means that in the class 
-if the unbiased estimators can only exist for the parametric functions of the 

form r<W) = 2 W/W for * ^ n - I. Let t(JV) be an arbitrary function of 

J-i 
this kind. Then the unbiasedness condition implies that 



SwfcK-sr-s 



cj/W* 1 YN^m. 



It follows that the coefficients $l_r$ of the sought-for estimator are uniquely defined
through the coefficients $c_j$. By taking into account that



5><*)'Hr=<*(# 



j = i, 2. 



we can directly check that the coefficients $l_r$ have the required form (see
Problem 1.52 (b)).

2.37. We first find the distribution of the random variable $\eta$. We use $A_i$
to denote the event that the $i$th element ($i = 1, \ldots, N$) has not
been observed, and suppose that $\mu_0(n, m, N) = N - \eta$ is the total number
of the elements which have not been observed. Then, by the formula for the
sum of probabilities [2], we will have



P(Mn. m. N) > 0) = P ( U At) 



J-l [<*,<,,<#/<« 

In the internal sum all the terms are $(C_{N-j}^m)^n/(C_N^m)^n$, and their number is $C_N^j$.
Therefore,

a 
F(«(«. m, M) > 0) = 2 (-iK* '€&(£#- j^AC©". 

Then 

P(r, = ft> = P( W (n, m, N> = TV - k) 

2 Vi A '< ■ - - ' 4 '»-.^i - - • a jJ> 

I < fr < . . (Jiv^dV 

where [/,, ...,/tJ - fl, .... N|N (« /«-*!• In the latter sum all the 

terms are equal, and their number is C%. By the theorem an the product of 
probabilities, we will have 

t(A,,...A( K _,~Aj l .. .A Jt ) = P(>ij,. ->ii tf .,)P(S}, -•'A&\Ai t .,.At li „J. 

Here 

and 

P{Tj, . . .4Ll/*i, • ..A»J = F(M". «■ *> = 0) 

= 2 (-i/ci(C?.j)'/(cn". 

These relations give 

p<„ = k) = c& 2 (-i>*" J 'c{(C7>vccf;)", 

Jt — m. m + 1, .... min (m, n, rV). 
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Now i« N ^ mn. Then 



Et* = 2 i»C*)Fh ■=*) =(Q-'i;ct£ (-1)* "ti&ZffJjh 

because 

A - J (I for > : = N. 

If at > mn, then by taking into account the properties of the function AN). 
we may write 

Er* »(C!»"" 2 eft 2 t-D'-^ci/o-). 

t 
it Is sufficient to show that 2 (- 0* ~ J Cl/"C) = for Jt > mn, which fol- 

lows from the chain of equations 



*-r 



2 (-U^'CUA = («r 2 (-)>*-'-'«_, = 0, r < fc 



J-0 



We now have 

NL-J 



Er* = (CJD- 2 /WCt 2 (-D'Cv., = (CS>— AAO = rfA). 



J«i 



2.41. Since in repeated independent trials the vectors $X$ and $\pi X$ are distributed in the same way, we have $\mathbf{E}_\theta T^* = \mathbf{E}_\theta T = \tau$, i.e., $T^*$ is an unbiased estimator for $\tau(\theta)$. Now let $\mathbf{D}_\theta T(X) = \delta^2$, then $\mathbf{D}_\theta T(\pi X) = \delta^2$ and, according to the Cauchy-Schwarz inequality, we have

$$\mathrm{cov}_\theta(T(\pi_1 X), T(\pi_2 X)) \le \sqrt{\mathbf{D}_\theta T(\pi_1 X)\,\mathbf{D}_\theta T(\pi_2 X)} = \delta^2.$$

We then find

$$\mathbf{D}_\theta T^* = \frac{1}{(n!)^2}\left[\sum_{\pi}\mathbf{D}_\theta T(\pi X) + \sum_{\pi_1 \ne \pi_2}\mathrm{cov}_\theta(T(\pi_1 X), T(\pi_2 X))\right] \le \frac{\delta^2}{(n!)^2}\left(n! + n!(n! - 1)\right) = \delta^2.$$

We see that for any unbiased estimator we can find a symmetric one whose variance is not greater than the variance of the original estimate. Consequently, the optimum estimator (if it exists) should be sought among the symmetric functions of observations.
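
The effect of symmetrization can be seen on a toy example computed exactly. Everything specific in this sketch is hypothetical: the two-point distribution, the sample size $n = 3$ and the non-symmetric unbiased statistic $T(x) = x_1 + 2x_2 - 2x_3$ are chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations, product

def mean_and_variance(stat, values, probs, n):
    """Exact mean and variance of stat(X) for an i.i.d. sample from a finite distribution."""
    mean = second = Fraction(0)
    for idx in product(range(len(values)), repeat=n):
        p = Fraction(1)
        for i in idx:
            p *= probs[i]
        t = Fraction(stat(tuple(values[i] for i in idx)))
        mean += p * t
        second += p * t * t
    return mean, second - mean**2

def symmetrize(stat, n):
    """T*(x) = average of T over all permutations of the coordinates of x."""
    perms = list(permutations(range(n)))
    def sym(x):
        return sum(Fraction(stat(tuple(x[i] for i in perm))) for perm in perms) / len(perms)
    return sym

n = 3
values, probs = [0, 1], [Fraction(2, 3), Fraction(1, 3)]
t_raw = lambda x: x[0] + 2 * x[1] - 2 * x[2]       # unbiased for the mean, not symmetric
m_raw, v_raw = mean_and_variance(t_raw, values, probs, n)
m_sym, v_sym = mean_and_variance(symmetrize(t_raw, n), values, probs, n)
print(m_raw, v_raw, m_sym, v_sym)
```

Here the symmetrized statistic is simply the sample mean, and its variance is smaller than that of the original statistic while the mean is unchanged.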

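2.42. (1) Consider the statistic $T_\lambda = T^* + \lambda\psi$ which for any $\lambda$ is an unbiased estimator for $\tau$. Then, since $T^*$ is optimal, we have

$$\mathbf{D}_\theta T_\lambda = \mathbf{D}_\theta T^* + \lambda^2\mathbf{D}_\theta\psi + 2\lambda\,\mathrm{cov}_\theta(T^*, \psi) \ge \mathbf{D}_\theta T^*.$$

But this is possible for all $\lambda$ only if $\mathrm{cov}_\theta(T^*, \psi) = 0\ \forall\theta$.

(2) Let $T$ be an arbitrary unbiased estimator for $\tau$. Then the statistic $\psi = T^* - T$ has a zero mean, and we find from the above that

$$0 = \mathrm{cov}_\theta(T^*, T^* - T) = \mathbf{D}_\theta T^* - \mathrm{cov}_\theta(T^*, T).$$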

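2.43. In the model $\mathcal{N}(\theta, \sigma^2)$ we have $f(x; \theta) = \frac{1}{\sqrt{2\pi}\,\sigma}e^{-(x - \theta)^2/(2\sigma^2)}$. Then $-\frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = \frac{1}{\sigma^2}$. Therefore,

$$i(\theta) = \frac{1}{\sigma^2}.$$

In the model $\mathcal{N}(\mu, \theta^2)$ we have $-\frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = \frac{3(x - \mu)^2}{\theta^4} - \frac{1}{\theta^2}$, whence

$$i(\theta) = \frac{3}{\theta^2} - \frac{1}{\theta^2} = \frac{2}{\theta^2}.$$

In the model $\Gamma(\theta, \lambda)$ we have $f(x; \theta) = \frac{x^{\lambda - 1}e^{-x/\theta}}{\Gamma(\lambda)\theta^\lambda}$ and $-\frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = \frac{2x}{\theta^3} - \frac{\lambda}{\theta^2}$, whence

$$i(\theta) = \frac{2\lambda\theta}{\theta^3} - \frac{\lambda}{\theta^2} = \frac{\lambda}{\theta^2}.$$

In Cauchy's model $\frac{\partial}{\partial\theta}\ln f(x; \theta) = \frac{2(x - \theta)}{1 + (x - \theta)^2}$. Therefore,

$$i(\theta) = \frac{4}{\pi}\int_{-\infty}^{\infty}\frac{(x - \theta)^2}{[1 + (x - \theta)^2]^3}\,dx = \frac{1}{2}.$$

In the binomial model we have $f(x; \theta) = C_k^x\theta^x(1 - \theta)^{k - x}$ and $-\frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = \frac{x}{\theta^2} + \frac{k - x}{(1 - \theta)^2}$. Consequently,

$$i(\theta) = \frac{1}{\theta^2}\mathbf{E}_\theta X_1 + \frac{1}{(1 - \theta)^2}(k - \mathbf{E}_\theta X_1) = \frac{k}{\theta} + \frac{k}{1 - \theta} = \frac{k}{\theta(1 - \theta)}.$$

In Poisson's model $f(x; \theta) = e^{-\theta}\frac{\theta^x}{x!}$ and $-\frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = \frac{x}{\theta^2}$. Consequently,

$$i(\theta) = \frac{1}{\theta^2}\mathbf{E}_\theta X_1 = \frac{1}{\theta}.$$

In the model $\overline{Bi}(r, \theta)$ we have $f(x; \theta) = C_{r + x - 1}^{x}\theta^x(1 - \theta)^r$ and $-\frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = \frac{x}{\theta^2} + \frac{r}{(1 - \theta)^2}$. Consequently,

$$i(\theta) = \frac{r\theta}{\theta^2(1 - \theta)} + \frac{r}{(1 - \theta)^2} = \frac{r}{\theta(1 - \theta)^2}.$$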
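
For the discrete models the information amounts can be checked numerically as the mean of the squared score. The sketch below is illustrative only: the score is approximated by a central difference, the infinite supports are truncated, and all parameter values are hypothetical.

```python
import math
from math import comb, factorial

def information(pmf, support, theta, h=1e-5):
    """E[(d/dtheta ln f(X; theta))^2] with a central-difference score."""
    total = 0.0
    for x in support:
        p = pmf(x, theta)
        if p > 0.0:
            score = (math.log(pmf(x, theta + h)) - math.log(pmf(x, theta - h))) / (2 * h)
            total += p * score**2
    return total

k, r = 8, 3
binom = lambda x, th: comb(k, x) * th**x * (1 - th) ** (k - x)
poisson = lambda x, th: math.exp(-th) * th**x / factorial(x)
negbin = lambda x, th: comb(r + x - 1, x) * th**x * (1 - th) ** r

print(information(binom, range(k + 1), 0.3), k / (0.3 * 0.7))
print(information(poisson, range(60), 2.5), 1 / 2.5)
print(information(negbin, range(300), 0.4), r / (0.4 * 0.6**2))
```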

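2.44. In this case we have

$$\ln f(x; \theta) = -\frac{(x - \theta_1)^2}{2\theta_2^2} - \ln(\theta_2\sqrt{2\pi})$$

and

$$-\mathbf{E}_\theta\left(\frac{\partial^2}{\partial\theta_1\partial\theta_2}\ln f(X_1; \theta)\right) = \frac{2}{\theta_2^3}\mathbf{E}_\theta(X_1 - \theta_1) = 0.$$

This fact and the previous results give us the required formula.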
2.45. We find the form of the matrix $I(\theta)$ from the formulas

W Pi P« ' 

3 2 ln/(fc «J) i(f, ojv) 
dpi dp; p ft 

We can directly check that $I^{-1}(\theta) = \Sigma_{N-1}$, where the matrix $\Sigma_{N-1}$ is defined
as in Problem 1.53.

2.46. Suppose that &~ is an exponential model. Then for $ = (0, .... 0,1 
we have 

«c* * -£i.*« •-!; 4 imp* « 



as, ^; 






where 7*X) - £ B iX>>- Wc wkc *»/*> = ( ■ -gP) . J » «» - — « J" 1 " 
write '" ' \ J / 

Conversely, if we hove a representation s'(*)U(X; tf) « T»(X) - *#) for some 
a(£) - (adS), ..., aAfi)), r„(X>, and if*), then we have a special case of 

j. i 

This means that the function $f(x; \theta)$ has the required form.
In order to obtain the variance $\mathbf{D}_\theta\tau^*$, we recall that






- £,lT*(X)l/j(X; «)) = cov, (t'(X), t^X; fl» 
because E 8 [/j(X; *) = 0V«. Therefore, 



J-i ' 



M^>= cw,(t', «'(*)tJ(X; «) 

= cov, (f ', t" - t{B)) = D.t'. 



2.47. Since the variance of the efficient estimator coincides with the

Cramer-Rao bound, we find from the previous problem thai = — — — , 

A ' (0) tip) 

which gives the required expression for i(9). Vfe also have 

m = _e, gli n/ ^ : . *> = -a -(wim - cm = SlSL a -<« - cm, 

i.e., E»Wfl * ~C '«)«'(*)• 

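2.48. We directly apply the results of Problem 2.46. For example, in the model $\overline{Bi}(r, \theta)$ we have

$$f(x; \theta) = \exp\{x\ln\theta + r\ln(1 - \theta) + \ln C_{r + x - 1}^{x}\},$$

i.e., $A(\theta) = \ln\theta$, $B(x) = x$, $C(\theta) = r\ln(1 - \theta)$. Consequently,

$$\tau(\theta) = -C'(\theta)/A'(\theta) = r\theta/(1 - \theta), \quad \tau^* = \overline{X}, \quad \mathbf{D}_\theta\tau^* = \frac{\tau'(\theta)}{nA'(\theta)} = \frac{r\theta}{n(1 - \theta)^2}.$$

2.49. We have $-\frac{\partial^2}{\partial\theta^2}\ln f(x; \theta) = 2f(x; \theta)$, whence

$$i(\theta) = 2\int f^2(x; \theta)\,dx = 2\int_0^{\infty}\frac{t\,dt}{(1 + t)^4} = \frac{1}{3}.$$

Then

$$\mathbf{D}_\theta\xi = \mathbf{E}_\theta(\xi - \theta)^2 = \int_{-\infty}^{\infty}(x - \theta)^2 f(x; \theta)\,dx = \int_{-\infty}^{\infty}t^2 e^{-t}(1 + e^{-t})^{-2}\,dt = \frac{\pi^2}{3}.$$

From this we find

$$\mathbf{D}_\theta\overline{X} = \frac{\pi^2}{3n} > \frac{3}{n} = \frac{1}{n\,i(\theta)}.$$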




>t]QWS 


Trom 










a* 


<T 4 

+ 

it* 


a^, s 


1 = 


r - 





and the Bhattacharyya test. 

2.51. It follows from Problem 2.21 that $\mathbf{E}_\theta T^* = \theta^2$, $\mathbf{D}_\theta T^* = \frac{2(2\lambda n + 3)}{\lambda n(\lambda n + 1)}\theta^4$. The lower Cramér-Rao bound for the function $\tau(\theta) = \theta^2$ is (see Problem 2.43) $\frac{4\theta^4}{\lambda n}$, which is smaller than $\mathbf{D}_\theta T^*$. Therefore, $T^*$ is not an efficient estimator for $\tau(\theta)$. The optimality of $T^*$ follows from

j_ /«» at + e' a^\ ^ * ^ , 

i \Xn 30 XrtfXn + 1) SB 1 J \Q,n 4- 1) 

and the Bhattacharyya test. 

2.52. The optimality of the estimators follows from

6i a In L 

tmi 

and the Bhattacharyya test. The information matrix for the model $\mathcal{N}(\theta_1, \theta_2^2)$ was found in Problem 2.44, whence we find the Cramér-Rao bound $\theta_2^2/n$ for the function $\tau_1(\theta) = \theta_1$, which coincides with $\mathbf{D}_\theta\overline{X}$, i.e., $\overline{X}$ is an efficient estimator for $\tau_1(\theta)$. For the function $\tau_2(\theta) = \theta_2^2$ the bound is $2\theta_2^4/n$, which is smaller than $\mathbf{D}_\theta(S'^2) = 2\theta_2^4/(n - 1)$ (see the solution to Problem 2.14), i.e., $S'^2$ is not an efficient estimator for $\tau_2(\theta)$.

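2.53. It is clear from the previous problem that the sample mean of the pooled sample is the best estimator for the mean, i.e., the statistic $\overline{X} = \frac{1}{n}(n_1\overline{X}_1 + n_2\overline{X}_2)$, where $n = n_1 + n_2$. We also have

$$\mathbf{D}_\theta\overline{X} = \frac{\theta_2^2}{n} < \min(\mathbf{D}_\theta\overline{X}_1, \mathbf{D}_\theta\overline{X}_2) = \theta_2^2/\max(n_1, n_2).$$

Similarly, the statistic

$$S'^2 = \frac{n}{n - 1}(A_{n2} - \overline{X}^2)$$

is the best estimator for the variance and takes into account all the information. But $nA_{n2} = n_1A_{n_1 2}^{(1)} + n_2A_{n_2 2}^{(2)}$, where $A_{n_i 2}^{(i)}$ is the second-order sampling moment of the $i$th sample, $i = 1, 2$. From the formula

$$S_i'^2 = \frac{n_i}{n_i - 1}\left(A_{n_i 2}^{(i)} - \overline{X}_i^2\right)$$

we have $n_iA_{n_i 2}^{(i)} = (n_i - 1)S_i'^2 + n_i\overline{X}_i^2$. We finally find that

$$S'^2 = \frac{1}{n - 1}\left((n_1 - 1)S_1'^2 + (n_2 - 1)S_2'^2 + n_1\overline{X}_1^2 + n_2\overline{X}_2^2 - n\overline{X}^2\right)$$

and (see Problem 2.14)

$$\mathbf{D}_\theta(S'^2) = \frac{2\theta_2^4}{n - 1} < \min(\mathbf{D}_\theta(S_1'^2), \mathbf{D}_\theta(S_2'^2)) = \frac{2\theta_2^4}{\max(n_1, n_2) - 1}.$$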
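
The pooling formula for $S'^2$ is a purely algebraic identity and can be checked on any two data sets. In the sketch below the two small samples are hypothetical, and exact fractions are used so that the comparison involves no rounding.

```python
from fractions import Fraction

def mean(xs):
    return Fraction(sum(xs), len(xs))

def unbiased_var(xs):
    """S'^2 with divisor len(xs) - 1."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def pooled_unbiased_var(n1, m1, v1, n2, m2, v2):
    """S'^2 of the pooled sample from the sizes, means and unbiased variances of the parts."""
    n = n1 + n2
    m = (n1 * m1 + n2 * m2) / n
    return ((n1 - 1) * v1 + (n2 - 1) * v2 + n1 * m1**2 + n2 * m2**2 - n * m**2) / (n - 1)

a = [1, 4, 6, 9]            # hypothetical first sample
b = [2, 3, 3, 8, 10]        # hypothetical second sample
pooled = pooled_unbiased_var(len(a), mean(a), unbiased_var(a), len(b), mean(b), unbiased_var(b))
print(pooled, unbiased_var(a + b))     # identical values
```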

2.54. (1) Suppose that X is known. We are dealing with the exponential 
model with (see Problem 2.46) 

B{x) = x. A(8) = -A, C (fl> = ~, 

2$ 2 * 

where = f- The efficient estimator here is r* = X and the respective para- 

.1 1 It* 

metric function is Tip) = p. We also have Dt* = — IMTi = = — . If X 

is unknown, we calculate " "^ C* "* 

3 In L Xn — 
— — = -j (* - *) 

and apply the Bhattacharyya test to obtain the required result. 
(2) We are dealing here with the exponential model 

B{x) = 4 + -• A <& = "|. c < 9 > - * + I ln e > 

y? X 2 ft 2 
where $\theta = \lambda$ and, according to the solution of Problem 2.46, the statistic

t* = - ). [ -- + — 1 is an efficient estimator for r(0) = - + — Con- 
" ■ £ — 1 V xj X /. 

J-f 

sequently, the soughL-for estimator has the form t" — 2/(i. 

2.55. Using the Cramér-Rao inequality for scalar estimators, for all $c \in R^n$
we have

$$\mathbf{D}_\theta(c'T) \ge b'(\theta)I_n^{-1}(\theta)b(\theta),$$

where 



bW = L<m t .*w\ 

m 

-(2 



Thus, 



9nC« V" 3 "W\ »«, 



$$c'\mathbf{D}_\theta(T)c \ge c'B'(\theta)I_n^{-1}(\theta)B(\theta)c.$$

This means that the matrix $\mathbf{D}_\theta(T) - B'(\theta)I_n^{-1}(\theta)B(\theta)$ is non-negative definite.

2.56. If an efficient estimator exists for $\tau(\theta)$, then the model is exponential (see Problem 2.46). Therefore,

$$L(x; \theta) = \exp\left\{A(\theta)T(x) + nC(\theta) + \sum_{i=1}^{n}D(x_i)\right\}, \quad T(x) = \sum_{i=1}^{n}B(x_i),$$

and, according to the factorization test, $T(X)$ is a sufficient statistic.

2.57. The sufficiency of $T_n$ follows from Problems 2.56 and 2.48. We now test it for completeness, i.e., show that only the function $\varphi$ for which $\varphi(j) = 0$, $j = 0, 1, \ldots, kn$, meets the condition $\mathbf{E}_\theta\varphi(T_n) = 0\ \forall\theta \in (0, 1)$. Due to the reproducibility, $\mathcal{L}(T_n) = Bi(kn, \theta)$. Consequently, we have

$$\mathbf{E}_\theta\varphi(T_n) = \sum_{j=0}^{kn}\varphi(j)\,C_{kn}^{j}\,\theta^j(1 - \theta)^{kn - j}$$

and the completeness condition is equivalent to

$$\sum_{j=0}^{kn}\varphi(j)\,C_{kn}^{j}\,x^j = 0 \quad \forall x > 0, \quad x = \frac{\theta}{1 - \theta}.$$

Since the polynomial is identically zero, all its coefficients are zero, which we wanted to prove. The statistic $T_n$ being complete, we conclude that the unbiased estimators only exist for the parametric functions $\tau(\theta)$ of the form $\mathbf{E}_\theta H(T_n)$, i.e., for the polynomials in $\theta$ of degree $\le kn$. Specifically, the statistic $(T_n)_j/(kn)_j$ is an unbiased (and hence optimum) estimator of the power $\theta^j$ for $j \le kn$ (see Problem 1.52), and the linearity of the optimality property implies that the unbiased estimator for the polynomial $\tau(\theta) = \sum_{j=0}^{r}a_j\theta^j$ at $r \le kn$ has the form given in the statement of the problem. This generalizes the results of Problems 2.5, 2.7, and 2.8 making them stronger.

2.58. The sufficiency of the statistic $T_n$ follows from the results obtained in Problems 2.56 and 2.48. We have $\mathcal{L}(T_n) = \Pi(n\theta)$, and therefore the completeness condition has the form

$$\sum_{k=0}^{\infty}\varphi(k)\frac{(n\theta)^k}{k!} = 0 \quad \forall\theta > 0.$$

But since the power series is identically zero, all its coefficients are zero. The completeness condition is only met by the function $\varphi$ for which $\varphi(k) = 0$, $k = 0, 1, 2, \ldots$. This means that the statistic $T_n$ is complete. It follows from Problem 2.9 that the statistic $(T_n)_j/n^j$ is an optimum unbiased estimator for $\theta^j$. Taking into account that the optimality property is linear, we get the last assertion of the problem.

2.59. We find from the previous problem that the function $\tau(\theta) =$

m 

^] (fl(z — OV/yi has an optimum estimator of the form 
j-a 



T = 



= Zj 



jV 



- 2«.^-(-^) K . 



for the function t*<0) = / A {— iy ^ -; - the optimum estimator is 



~k\j\ 



j-a j-a 

for the function rAfi) ~ 2 **(P) i' '* 

fc"r 

1 2— 8S<J 
2.60. (1) We have $f(x; \theta) = \exp\{A(\theta)B(x) + C(\theta) + D(x)\}$ for $A(\theta) = \ln\theta$, $B(x) = x$, $C(\theta) = -\ln f(\theta)$, $D(x) = \ln a(x)$. The required assertion directly follows from the result of Problem 2.46.

(2) The likelihood function and the factorization test prove the sufficiency of $T_n$. To find the distribution of $T_n$, we recall that its generating function (see the hint) is

$$\mathbf{E}_\theta z^{T_n} = \varphi^n(z; \theta) = f^n(z\theta)/f^n(\theta).$$

Extracting from the right-hand side the coefficient at $z^t$, we obtain the required result.

Now let $\varphi(t)$ be an arbitrary function defined on the set $\{nl, nl + 1, \ldots\}$ and such that $\mathbf{E}_\theta\varphi(T_n) = 0\ \forall\theta \in \Theta$, i.e.,

$$\sum_{t}\varphi(t)b_n(t)\theta^t = 0 \quad \forall\theta \in \Theta.$$

It follows that $\varphi(t) = 0$ for all $t$ with $b_n(t) \ne 0$, i.e., $\varphi(t) = 0$ on the set of all possible values of the statistic $T_n$. Thus, $T_n$ is a complete sufficient statistic.

(3) By (2) it is sufficient to verify that $\mathbf{E}_\theta\tau_s^* = \theta^s$. We have

$$\mathbf{E}_\theta\tau_s^* = \sum_{t}\frac{b_n(t - s)}{b_n(t)}\cdot\frac{b_n(t)\theta^t}{f^n(\theta)} = \frac{\theta^s}{f^n(\theta)}\sum_{t}b_n(t - s)\theta^{t - s} = \theta^s$$

because $\sum_{t}b_n(t)\theta^t = f^n(\theta)$.

The optimality of the estimator $\tau^*(s)$ is proved in a similar way.
(4) Since the optimality property is linear, we use (3) to find

m r„ — i 



;-» 



Finally, if r„ S (« + 1)/, then we use the hint to find the estimator for f. 
2.61. This model is a special case of the model from Problem 2.60 (for
$f(\theta) = e^\theta - 1$). We therefore have $\tau^* = b_n(T - 1)/b_n(T)$, where

M« * coeV(e* _ l) B = 2 (-l) n "'C„A*/Arl - a »(*/*'. 

for $k \ge n$ and $b_n(k) = 0$ for $k < n$. Whence follows the required result.

2.62. Suppose that in Problem 2.60 $f(\theta) = (1 - \theta)^{-r}$. Taking into account
the expansion $f^n(\theta) = (1 - \theta)^{-rn} = \sum_{t=0}^{\infty} C^t_{rn+t-1}\theta^t$ (see Problem 2.11), we
find

(this is a stronger result compared to Problem 2.11). 
Since n(fl) = f~ '[(f), we find that 

(I H + rn — l)r 

as in Problem 2.60. 
We finally obtain 

(7-.), <™ - I), 



T 5 - Cr<!+r„-J-j-l /C rnt-T.-1 



(r, + m - I), (T„ + rn~s- !)/ 



2.63. The hint to Problem 2.45 implies that the function /(or; 0) can be 
represented as 

fi.x\ ff) = exp | 2 ^. «> + In P« j ■ 

Pi 
where 0/ = In — — , j = I, , , ., W ~ I. 

By the test for an r-parametric exponential family (for r = N - 1) it follows 

n 

that T = (T, Tw- [), where 7) = 2 W. «/) = *W = 1 AT - I, 

i-i 
is a minimal complete sufficient statistic. Consequently, in the given model
the unbiased estimators only exist for the parametric functions of the form
$\mathbf{E}_\theta H(T)$. The class of these functions coincides with the class of the polynomials
in $p_1, \ldots, p_N$ of degree $\le n$ (see the solution to Problem 2.29). It follows
from Problem 1.52 (b) that the statistic
$\tau^* = (\nu_1)_{k_1}\cdots(\nu_N)_{k_N}/(n)_{k_1+\ldots+k_N}$ is
an optimum estimator for $\tau(\theta) = p_1^{k_1}\cdots p_N^{k_N}$ for $k_1 + \ldots + k_N \le n$. The
estimators for arbitrary polynomials are constructed from the linear combinations
of these statistics (because the optimality property is linear).
2.64. The model $\mathcal{N}(\theta, \sigma^2)$ is of an exponential type, and its sample mean
$\bar X$ is its complete sufficient statistic. Therefore, $T^*$ is an optimum unbiased
estimator for the function $\tau(\theta) = \theta^2$ (compare to Problem 2.50). The optimality
of the estimators in Problem 2.16 is proved in a similar way because $T^*$ is a
complete sufficient statistic for the model $\mathcal{N}(\mu, \theta^2)$.

2.65. (1) We have $\mathbf{E}_\theta T_1 = \mathbf{P}_\theta(X_1 \le x_0) = \tau(\theta)$, i.e., $T_1$ is an unbiased
estimator for $\tau(\theta)$. Since $\bar X$ is here a complete sufficient statistic (see the
solution to Problem 2.64), the optimum estimator can be computed by the formula

$$\tau^* = \mathbf{E}_\theta(T_1 \mid \bar X) = \mathbf{P}_\theta(X_1 - \bar X \le x_0 - \bar X \mid \bar X).$$

But $X_1 - \bar X$ and $\bar X$ are independent (see Problem 1.56) and
$\mathcal{L}(X_1 - \bar X) = \mathcal{N}\left(0, \frac{n-1}{n}\sigma^2\right)$. Therefore,

$$\tau^* = \mathbf{P}_\theta(X_1 - \bar X \le x_0 - \bar X) = \Phi\left(\sqrt{\frac{n}{n-1}}\;\frac{x_0 - \bar X}{\sigma}\right).$$
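As a sanity check on part (1), the following sketch integrates the estimator $\Phi\left(\sqrt{n/(n-1)}\,(x_0 - \bar X)/\sigma\right)$ against the density of $\bar X \sim \mathcal{N}(\theta, \sigma^2/n)$ and compares the result with $\Phi((x_0 - \theta)/\sigma)$. The trapezoid rule, the grid size and the integration width are assumptions made only for this illustration.

```python
import math


def Phi(z: float) -> float:
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_umvue(theta: float, sigma: float, n: int, x0: float,
                   grid: int = 20001, width: float = 10.0) -> float:
    """E of Phi(sqrt(n/(n-1)) (x0 - Xbar)/sigma) for Xbar ~ N(theta, sigma^2/n),
    computed with the trapezoid rule on theta +/- width standard deviations."""
    sd = sigma / math.sqrt(n)
    c = math.sqrt(n / (n - 1)) / sigma
    lo = theta - width * sd
    h = 2.0 * width * sd / (grid - 1)
    total = 0.0
    for i in range(grid):
        m = lo + i * h
        weight = 0.5 if i in (0, grid - 1) else 1.0
        density = math.exp(-0.5 * ((m - theta) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        total += weight * Phi(c * (x0 - m)) * density
    return total * h


if __name__ == "__main__":
    print(expected_umvue(1.0, 2.0, 5, 2.5), Phi((2.5 - 1.0) / 2.0))
```

Both numbers should coincide to several decimal places, which is what unbiasedness of the conditional-expectation estimator requires.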

{2) In the first case we have r,(0; a 1 ) = E,d?)(J) = exp {it& - (l/2)ffV|, 
whence r\ = exp {itX - {V2)tr\t 2 \. In the second case we have n(8; <**) = 
e 1 + a\ whence tJ ~ X 1 + a 2 (compare with Problem 2.64). Finally, 

wc have ti($; tr 3 ) = * ( I , and the estimator for tJ coincides with that 



{-)■ 



obtained in (1). 

2.66. Here the function $f(x; \theta)$ can be written as

$$f(x; \theta) = \exp\{\theta_1' x + \theta_2' x^2 + c(\theta_1', \theta_2')\}, \qquad
\theta_1' = \frac{\theta_1}{\theta_2^2}, \quad \theta_2' = -\frac{1}{2\theta_2^2}.$$

By the test for an r-parametric exponential family it follows that
$\left(\sum_{i=1}^{n} X_i, \sum_{i=1}^{n} X_i^2\right)$ is a minimal complete sufficient statistic. The
equivalent pair $(\bar X, S^2)$ is also a minimal complete sufficient statistic since the
two statistics are in a one-to-one correspondence. This means that the estimators
in Problem 2.20 are optimal.

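2.67. Since $\mathbf{E}_\theta(S^2) = \frac{n-1}{n}\sigma^2 = \frac{n-1}{n}\gamma^2\theta^2$ (see Problem 1.27) and
$\mathbf{E}_\theta(\bar X^2) = \theta^2 + \frac{1}{n}\gamma^2\theta^2 = \frac{n + \gamma^2}{n}\theta^2$ (see Problem 2.13), we have
$\mathbf{E}_\theta\varphi(T) = 0$ for all $\theta$. This means that the test for completeness is not fulfilled.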

2.68. Let $X = (X_1, \ldots, X_n)$ be the respective measurements. Then $X$ is
a sample from the distribution $\mathcal{N}(\theta_1, \theta_2^2)$, and we are estimating the parametric
function $\tau(\theta) = \frac{\pi}{4}\theta_1^2$. Using the solution to Problem 2.67, we find

$$\mathbf{E}_\theta\left(\bar X^2 - \frac{1}{n-1}S^2\right) = \theta_1^2 + \frac{1}{n}\theta_2^2 - \frac{1}{n-1}\cdot\frac{n-1}{n}\theta_2^2 = \theta_1^2.$$

Since the sufficient statistic $T = (\bar X, S^2)$ is complete (see Problem 2.66), we
see that the statistic $\tau^* = \frac{\pi}{4}\left(\bar X^2 - \frac{1}{n-1}S^2\right)$ is an optimum unbiased
estimator for $\tau(\theta)$.

2.69. As stated, the conditional $\varphi_A(T) = \mathbf{P}_\theta(T_1 \in A \mid T)$ and unconditional
$\gamma_A = \mathbf{P}_\theta(T_1 \in A)$ probabilities of the event $A$ do not depend on the parameter
$\theta$. We also have $\mathbf{E}_\theta\varphi_A(T) = \gamma_A$, i.e., $\mathbf{E}_\theta g(T) = 0$ $\forall\theta$, where $g(T) = \varphi_A(T) - \gamma_A$.
This and the completeness of the statistic $T$ give $\varphi_A(T) \equiv \gamma_A$, which means
that the conditional and unconditional probabilities coincide and the statistics
$T_1$ and $T$ are independent.

2.70. Here t 1 •* Xl + Xl + Xl is a complete sufficient statistic (see the 
solution to Problem 2.64), and T, = /(Jfi sS *a) is an unbiased estimator for 
t<S). Therefore, the optimum estimator is 



ErtftTO = P, f^ =£ 2 rj. 



The statistic q = AV7" = Ki/Wf' + xl, where K = X,/a. i = I, 2, 3, 
^J =, yj + y| t has a distribution independent of the parameter 8 (because 
Ji(Yi) =,>(0, t>, I » ir % *> atvd - according to the Basu theorem (see 

Problem 2.69), i> and 1" are independent. We thus have r = F % t — J , where 

F,(u) - I'd) ^ u). The distribution function F,(«) was found in Problem 1.58 
for an arbitrary sample size. We take n =■ 4 to obtain 

1 / e . l\ J + U 

W-i-^-M.iJ- — 

f or ^ » ^ 1 and 

F,(u) = 1 - F,<-«1 • 

for - 1 =S " « 0. We thus find S'(i) 



K> : ?) 




2.71. We introduce the random variables $Y_i = \frac{X_i - \theta_1}{\theta_2}$, $i = 1, \ldots, n$, whose
distributions are independent of $\theta = (\theta_1, \theta_2)$. Then

$$U_i = \frac{X_i - \bar X}{S(X)} = \frac{Y_i - \bar Y}{S(Y)}, \quad i = 1, \ldots, n,$$

i.e., the distribution of the statistic $U$ is independent of $\theta$. Since $T = (\bar X, S^2(X))$
is a complete sufficient statistic for the model $\mathcal{N}(\theta_1, \theta_2^2)$ (see Problem 2.66),
the statistics $T$ and $U$ are independent by the Basu theorem (see Problem 2.69).

2.72, Since E*?\ = PAX* £, to) = T»), 7\ is an unbiased estimator for 
K6). ConseqwenUit. the optimum estimator can be found from (in what follows 
7 = (X, S 1 -)) 

t" = E*(nm = p«<*» < .»i|r) = *V? ^ «»P>. 

. According to the solution to the previ- 

■Jn - 1 S 
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Otis problem, the siacistics tj and 7"are independent and, therefore, t" = /\{ua). 
The distribution function of the statistic 17 was calculated in Problem 1.58. We 
use this result to obtain 

-la 



f| - «*; *Z_* 1 \ for X < Xq, 
1 B ( 1 - «J; ~1, I) for AS* .«,. 



Published tables for the beta distribution function B(s; a. b) will help in the 
calculations. 

2.73. The model $\Gamma(\theta, \lambda)$ is of the exponential type, and $T = \sum_{i=1}^{n} X_i$ is its
complete sufficient statistic. Then the estimators in Problem 2.21 are optimal.
In order to prove the second assertion, it is sufficient to show that there is 
no (unction H{ T) which meets the condition E»//( T) = fl "* Vtf > 0. This con- 
dition can be written as 



J Httofa 



•dx 



where z = 1/ft W.W * #M*** "Vrftfl J i 

If m = a — X« + I is an integer, then we differentiate the identity m times 
with respect to z to obtain 






e'^rfe = 0. 



The integral is ihe Laplace transform of jr m //,{x>, and therefore for * > we 
have //(*) = 0. 

2.74. Since 7" is a complete sufficient statistic, It is enough to show that 
the estimator t' is unbiased. We have ^s(7) = r(0, Xn) Consequently, 

E,»s(/7-)= J^<^) ; r >u '" , e' i/ ''rfJr/(^(X^)fl , '") 




and 



We change the order of integration and get 



f integral i 

» C 



,:y = nxr '■ L *'" / 
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Specifically, if a > ~X, then the mean Et<rtl) exists for all n>(x) = x" and we 
have 

_ ,_ r(X + h} „, 

lit) = E»rf8 - l ■-- - ■ e: 
I w 

Therefore, 



r(\n)rt> + X) 

T - _ _ .:.::..' i :■- - ' '<- - ■ -r, = — : — :_^_ 7", 



Or(X(n- 1» J 



r(X)r(X(n - i» J r(X)r( D + x«) 



o 



For a = 1, 2 we arrive al the results of Problems 2.43 and 2.51, respectively. 
We find a more general result in a similar way, i.c, if i(6) = B*e"' / '. 
> -Xrt, X ? 0, then its optimum estimator is 

TOW) jT-xf*- 1 

rrxn + o) 7** " " 

2.7S. "We have t(8; f) = E»/« 5= = P»(f ? 0- Consequently, 



r(X(n - I)) J 



r(X)HX(n 



/(*T Js fjut^-'d - *)**— a ~ l <te 



Here /(-t7" ^i) = t * JT > r/7; If 7" $ r, then the integral is equal lo zero, 
and if T > (, then it is expressed in terms of the beta distribution function as 

t **-"(! - x) yv - »* ' dx = B(X. X(n - ty( 1 - B f £, ; X, X(n - 1) \ J . 



r/T 

Specifically, for X ■ I we have 






which gives the required result for the exponential distribution. 
2,76. "We write the likelihood function in the form 

According to the factorization test, T is a sufficient statistic; We look for its 
distribution. Noting that 

Pi{2(s/e) i < x) = p. ft =s e (|Y V = i - e-*". 
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>£.. SiQWB)*-) = x \2), we find >;(277p*) = x^ln). The distribution density 
of T has the form 

fjixt = J" ' *-*?, x > O. 

The condition E (w (r) = vtf > has the form 

| tpixpc" 'e ~ ™ rf* = V? > 0. 

B 

It follows that ?(x) - 0, x > 0, i j„ ihc statistic 7" is complete. It is now 
sufficient to verify that E,t' = t(0). We change the order of integration to 
obtain 

E#r* = {n - 1) 1 (I - /)"" A 1 «*(M)"'VHx-)<fr| <« 

■^W-[J--*a-)"^i* 



" •*) 



rWi^-'e'-Wf dy = E.^a = r(B): 



i 
For H«) - x* we have t{0) - ff* and t* = (n - })T f ((1 - r>"" 2 tf/ = 77«. 

o 
2.77. By using the indicators, we write the likelihood function as 
it 



= — fl^tn S *ij exp 



The required assertion follows from the factorization test. We find the estimates 
S\ - {nX w - X)/(n - 1), 6>; _ („/(„ _ m X - X (t >) with T> t 8[ . 
«|/(n(w - 1)), D^Uj = ff|/(n - |). 

2.78. Similarly to Problem 2.77, wc write the likelihood function as 

la I 

We see that $X_{(n)}$ is the only one-dimensional sufficient statistic, and it exists
only if $f(x; \theta)$ can be expanded into the product of two functions, one of which
depends on $x$ and the other depends on $\theta$, i.e., if $f(x; \theta) = g(x)/h(\theta)$.

2.79. The solution to the previous problem shows that $X_{(n)}$ is a sufficient
statistic. In order to verify its completeness, we must find the distribution of
$X_{(n)}$. We have

$$\mathbf{P}_\theta(X_{(n)} \le x) = \mathbf{P}_\theta^n(X_1 \le x) = \left(\frac{x}{\theta}\right)^n, \quad 0 \le x \le \theta.$$

If

$$\mathbf{E}_\theta\varphi(X_{(n)}) = \frac{n}{\theta^n}\int_0^\theta \varphi(x)\,x^{n-1}\,dx = 0 \quad \forall\theta > 0,$$

then we differentiate the identity $\int_0^\theta \varphi(x)\,x^{n-1}\,dx = 0$ with respect to $\theta$ and
obtain $\varphi(\theta) = 0$, $\theta > 0$. This means that $X_{(n)}$ is complete. Since $T^*$ is an
unbiased estimator for $\theta$ (see Problem 2.24), it is a function of the complete
sufficient statistic and hence is the optimum estimator among all the unbiased
estimators. We also find

$$\mathbf{E}_\theta(\lambda T^* - \theta)^2 = \mathbf{E}_\theta[\lambda(T^* - \theta) + (\lambda - 1)\theta]^2
= \lambda^2\mathbf{D}_\theta T^* + (\lambda - 1)^2\theta^2 = \psi(\lambda)\theta^2,$$

where (see the solution to Problem 2.24) $\psi(\lambda) = \frac{\lambda^2}{n(n+2)} + (\lambda - 1)^2$. We have
$\min_\lambda\psi(\lambda) = \psi(\lambda^*) = \frac{1}{(n+1)^2}$ and $\lambda^* = \frac{n(n+2)}{(n+1)^2}$. Thus,
$\theta^2/(n+1)^2$ is the mean square error for the estimator
$T_{\lambda^*} = \lambda^* T^* = \frac{n+2}{n+1}X_{(n)}$.

We prove the optimality of $\tau^*$ by verifying the relation $\mathbf{E}_\theta\tau^* = \tau(\theta)$ $\forall\theta > 0$.
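The mean square error comparison in Problem 2.79 can be reproduced in exact arithmetic. The sketch below assumes the model $R(0, \theta)$, for which $\mathbf{E}_\theta X_{(n)} = n\theta/(n+1)$ and $\mathbf{E}_\theta X_{(n)}^2 = n\theta^2/(n+2)$, and evaluates the error of an estimator of the form $cX_{(n)}$ in units of $\theta^2$; the helper name `mse_coeff` is hypothetical.

```python
from fractions import Fraction


def mse_coeff(c: Fraction, n: int) -> Fraction:
    """E(c * X_(n) - theta)^2 / theta^2 for a sample of size n from R(0, theta)."""
    m1 = Fraction(n, n + 1)  # E X_(n) / theta
    m2 = Fraction(n, n + 2)  # E X_(n)^2 / theta^2
    return c * c * m2 - 2 * c * m1 + 1


n = 10
unbiased = mse_coeff(Fraction(n + 1, n), n)      # c = (n + 1)/n, the unbiased choice
best = mse_coeff(Fraction(n + 2, n + 1), n)      # c = (n + 2)/(n + 1)

if __name__ == "__main__":
    print(unbiased, best)
```

For $n = 10$ the two values are $1/(n(n+2))$ and $1/(n+1)^2$, so the slightly biased multiple of $X_{(n)}$ has the smaller mean square error, as the solution states.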

2.80. For the model $R(\theta_1, \theta_2)$ the likelihood function can be written as
$L(x; \theta) = I(\theta_1 \le x_{(1)})\,I(x_{(n)} \le \theta_2)/(\theta_2 - \theta_1)^n$. Consequently, $T = (X_{(1)}, X_{(n)})$ is
a sufficient statistic whose distribution density was given in Problem 1.36.
We use this result to find that the condition $\mathbf{E}_\theta\varphi(T) = 0$ $\forall\theta$ is equivalent to

$$\iint_{\theta_1 \le x_1 \le x_2 \le \theta_2} \varphi(x_1, x_2)(x_2 - x_1)^{n-2}\,dx_1\,dx_2 = 0 \quad \forall\theta.$$

Differentiating this identity first with respect to $\theta_2$ and then with respect to
$\theta_1$, we reduce it to the identity $\varphi(\theta_1, \theta_2)(\theta_2 - \theta_1)^{n-2} = 0$. It follows that
$\varphi(\theta_1, \theta_2) = 0$ $\forall\theta_1 < \theta_2$, i.e., the statistic $T$ is complete. The estimators
constructed in Problem 2.25 are optimal as functions of $T$. Finally, since
$\mathbf{E}_\theta X_{(1)} = \frac{n\theta_1 + \theta_2}{n+1}$, $\mathbf{E}_\theta X_{(n)} = \frac{\theta_1 + n\theta_2}{n+1}$ (see Problem 1.36), the statistics
$(nX_{(1)} - X_{(n)})/(n - 1)$ and $(nX_{(n)} - X_{(1)})/(n - 1)$ are unbiased, and hence
optimal, estimators for the parameters $\theta_1$ and $\theta_2$, respectively.

2.81. As in the previous problem, we write the likelihood function as
$L(x; \theta) = I(x_{(1)} \ge a(\theta))\,I(b(\theta) \ge x_{(n)})/(b(\theta) - a(\theta))^n$ and see that the statistic $T$
is sufficient. If we have $a(\theta)\uparrow$, $b(\theta)\downarrow$ as $\theta$ grows, then

$$\{x_{(1)} \ge a(\theta),\ x_{(n)} \le b(\theta)\} \Leftrightarrow \{\theta \le a^{-1}(x_{(1)}),\ \theta \le b^{-1}(x_{(n)})\}
\Leftrightarrow \{\theta \le T_1(x) = \min(a^{-1}(x_{(1)}), b^{-1}(x_{(n)}))\},$$

and the likelihood function can be written in the form $L(x; \theta) =
I(T_1(x) \ge \theta)/(b(\theta) - a(\theta))^n$. This means that in our case a one-dimensional
sufficient statistic exists and has the form $T_1(X) = \min(a^{-1}(X_{(1)}), b^{-1}(X_{(n)}))$.
Similarly, if we have $a(\theta)\downarrow$, $b(\theta)\uparrow$ as $\theta$ grows, a one-dimensional statistic exists
and has the form $T_2(X) = \max(a^{-1}(X_{(1)}), b^{-1}(X_{(n)}))$. There are no other
one-dimensional sufficient statistics in the model $R(a(\theta), b(\theta))$.
For the model $R(-\theta, \theta)$ the statistic is

$$T_2(X) = \max(-X_{(1)}, X_{(n)}) = \max(|X_{(1)}|, |X_{(n)}|).$$

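2.82. The condition $\mathbf{E}_\theta\varphi(X) = 0$ $\forall\theta$ can be written as

$$\theta\varphi(-1) + (1 - \theta)^2\sum_{x=0}^{\infty}\varphi(x)\,\theta^x = 0$$

or (see the hint to Problem 2.11) as

$$\varphi(0) + \sum_{x=1}^{\infty}[\varphi(x) + x\varphi(-1)]\,\theta^x = 0.$$

The functions $\varphi$ for which $\varphi(0) = 0$, $\varphi(x) = -x\varphi(-1)$, $x = 1, 2, \ldots$, meet
this condition. The only bounded function of this type is $\varphi(x) = 0$, $x = -1,
0, 1, \ldots$. Thus, $X$ is a boundedly complete sufficient statistic which is not
complete.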

2.83. The distribution of the data $\mu = (\mu_1, \ldots, \mu_n)$ is in the set $\mathcal{S}_n$ of
the vectors $l = (l_1, \ldots, l_n)$ with integer non-negative components, which satisfy
the conditions

$$l_1 + \ldots + l_n \le N, \qquad l_1 + 2l_2 + \ldots + nl_n = n.$$

We find the likelihood function $\mathbf{P}_N(\mu = l)$, $l \in \mathcal{S}_n$. The total number of
possible outcomes of the experiment is $N^n$, and the number of the outcomes
compatible with the event $\{\mu = l\}$ can be calculated in the following way. We
first register the $k = l_1 + \ldots + l_n$ elements of the population $U$ which are
in the sample. This can be done in $C_N^k$ different ways. For every such subset
the number of ways to form a sample from the elements of the given subset
with the given value $l$ of the statistic $\mu$ is the same, under the condition that
all the $k$ elements must be in the sample. We denote this number $A(l; k, n)$.
Then the total number of favourable outcomes is $C_N^k A(l; k, n)$ and by the
classical definition of probability we have

$$\mathbf{P}_N(\mu = l) = g(k; N)\,A(l; k, n),$$

where the factor $g(k; N) = C_N^k N^{-n}$ depends on the parameter $N$, and only
depends on the sampling data through the value of the statistic $\eta$, which is
$k = l_1 + \ldots + l_n$ on the event $\{\mu = l\}$, while the factor $A(l; k, n)$
is independent of the parameter $N$. By the factorization test $\eta$ is a sufficient
statistic for $N$. To prove the completeness of $\eta$, we must show that $\varphi(k) = 0$
for all possible values of the statistic $\eta$ ($\forall N \ge 1$) for any function $\varphi(k)$ with
$\mathbf{E}_N\varphi(\eta) = 0$ $\forall N$. The distribution of $\eta$ is (see the solution to Problem 2.37)

$$\mathbf{P}_N(\eta = k) = g(k; N)\sum_{j=0}^{k}(-1)^{k-j}C_k^j\,j^n, \quad k = 1, 2, \ldots, \min(n, N).$$

We verify that $\varphi(1) = \varphi(2) = \ldots = \varphi(n) = 0$. If $N = 1$, then
$\mathbf{E}_1\varphi(\eta) = \varphi(1)g(1; 1) = \varphi(1) = 0$.

If $N = 2$, then

$$\mathbf{E}_2\varphi(\eta) = \varphi(1)\mathbf{P}_2(\eta = 1) + \varphi(2)\mathbf{P}_2(\eta = 2) = \varphi(2)g(2; 2)(2^n - 2) = 0,$$

i.e., $\varphi(2) = 0$. Similarly, putting $N = 3$, we find from the condition $\mathbf{E}_3\varphi(\eta) = 0$
and the equations $\varphi(1) = \varphi(2) = 0$ that $\varphi(3) = 0$, etc. We thus see that
$\varphi(k) = 0$ for all $k \le n$, and the statistic $\eta$ is complete.
2.84. The efficiency test implies that

$$\frac{\partial\ln L(x; \theta)}{\partial\theta} = \frac{T^* - \tau(\theta)}{a(\theta)},$$

whence follows that $\hat\theta_n$ satisfies the equation $\tau(\theta) = T^*$. We calculate the second
derivative

$$\frac{\partial^2\ln L(x; \theta)}{\partial\theta^2} = -\frac{a(\theta)\tau'(\theta) + (T^* - \tau(\theta))a'(\theta)}{a^2(\theta)}$$

to show that it is unique. Since we are dealing with the exponential model
(see the solution to Problem 2.46), $\mathbf{D}_\theta T^* = a(\theta)\tau'(\theta) > 0$ and, therefore,
$\partial^2\ln L(x; \theta)/\partial\theta^2 < 0$ wherever $\tau(\theta) = T^*$, i.e., every solution of $\tau(\theta) = T^*$ is a
local maximum of the likelihood function. For more than one maximum there
would have been a minimum between the maximums, i.e., at the points of minimum,
which also meet the equation $\tau(\theta) = T^*$, we would have $\partial^2\ln L/\partial\theta^2 \ge 0$. Since there
is no minimum, we conclude that only one maximum may exist. Accordingly,
we arrive at the following table for the values of $\hat\theta_n$ in some of the models
(use Problem 2.48):
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Model / (*. e*> i (?, e 1 ) r(«, X) s(», i) Bi(k, e) n(«) B/tr. 9) 

ft. X T, ~X/\ n X/k X X/(.r + X) 

Here 

2.85. In the distribution $\mathcal{N}(\theta, \sigma^2)$ the point $\theta$ is a theoretical median, and
therefore the sample median $T_n$ is in the given case a consistent estimator for
$\theta$ whose distribution, according to the solution of Problem 1.32, satisfies the
asymptotic relation $\mathcal{L}(T_n) \sim \mathcal{N}(\theta, \pi\sigma^2/(2n))$, i.e., $\sigma_T^2(\theta) = \pi\sigma^2/2$. Then $\hat\theta_n = \bar X$
(see the solution to Problem 2.84) and $\mathcal{L}(\hat\theta_n) = \mathcal{N}(\theta, \sigma^2/n)$. From this we get
$\mathrm{eff}(T_n; \theta) = \sigma^2/\sigma_T^2(\theta) = 2/\pi = 0.637$. This means that for large $n$ the sample
mean $\bar X$ for the sample of size $n' = 2n/\pi$ estimates $\theta$ with the same accuracy
as the sample median $X_{([n/2]+1)}$ of the sample of size $n$, whatever the values
of $\theta$ and $\sigma^2$.

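2.86. We have

$$L(x; \theta) = \frac{1}{(\sqrt{2\pi}\,\theta_2)^n}\exp\left\{-\frac{1}{2\theta_2^2}\sum_{i=1}^{n}(x_i - \theta_1)^2\right\}
= \frac{1}{(\sqrt{2\pi}\,\theta_2)^n}\exp\left\{-\frac{n}{2\theta_2^2}\left(s^2 + (\bar x - \theta_1)^2\right)\right\},$$

where $s^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x)^2$, $\bar x = \frac{1}{n}\sum_{i=1}^{n}x_i$. The likelihood equations
$\frac{\partial\ln L}{\partial\theta_1} = \frac{\partial\ln L}{\partial\theta_2} = 0$ become

$$\bar x = \theta_1, \qquad \theta_2^2 = s^2 + (\bar x - \theta_1)^2.$$

They uniquely define the solution $\theta = (\bar x, s)$. Let us show that this is the point
of maximum of the likelihood function. We rewrite $L(x; \theta)$ in the form

$$L(x; \theta) = (2\pi e s^2)^{-n/2}\exp\{-n\psi(x; \theta)\},$$

where $\psi(x; \theta) = \frac{(\bar x - \theta_1)^2}{2\theta_2^2} + \frac{1}{2}\left(\frac{s^2}{\theta_2^2} - 1\right) - \ln\frac{s}{\theta_2}$. We shall minimize the
function $\psi(x; \theta)$ in $\theta = (\theta_1, \theta_2)$. The estimate $\ln a \le a - 1$ $\forall a > 0$ is easily
found. Putting $a = b^2$, we obtain the estimate $\ln b \le (b^2 - 1)/2$ $\forall b > 0$ (the
equality only holds for $b = 1$). We now have $\psi(x; \theta) \ge 0$ (the equality only
holds at the point $\theta = (\bar x, s)$). Consequently, $L(x; \theta) \le (2\pi e s^2)^{-n/2}$ (with the
equality at the point $\theta = (\bar x, s)$). We have thus proved that $\hat\theta_n = (\hat\theta_{1n}, \hat\theta_{2n})$ exists,
is unique, and $\hat\theta_n = (\bar X, S)$.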

2.87. We find the form of t, from Problem 2.86 and the invariance 
property of the maximum likelihood estimates. For n -* « we have 
vS(vn<T,, - rit})) - . / (0, o%6)), where (see Problem 2.44) 

-J*. (Ab-a^l 



- e 



2t 

2.88. The form of $\hat\theta_n$ is given in Problem 2.84. Using it, we may write



V2T 






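where $\tau_1$ is the unbiased estimator for $\theta$ obtained in Problem 2.16. We find
$\mathbf{E}_\theta\hat\theta_n = c_n\theta$, $c_n = \sqrt{\frac{2}{n}}\,\frac{\Gamma((n+1)/2)}{\Gamma(n/2)}$. Stirling's formula for the gamma function
$\Gamma(z) \sim \sqrt{2\pi}\,z^{z-1/2}e^{-z}$, $z \to \infty$, gives $c_n \to 1$ as $n \to \infty$, and we see that $\hat\theta_n$ is
asymptotically unbiased. We then have

$$\mathbf{D}_\theta\hat\theta_n = \mathbf{E}_\theta\hat\theta_n^2 - (\mathbf{E}_\theta\hat\theta_n)^2 = \theta^2(1 - c_n^2) \to 0, \quad n \to \infty,$$

whence follows that $\hat\theta_n$ is consistent.

From Problem 2.43 we infer that $\mathcal{L}(\sqrt{n}(\hat\theta_n - \theta)) \to \mathcal{N}(0, i^{-1}(\theta)) =
\mathcal{N}(0, \theta^2/2)$ as $n \to \infty$, and consider the estimator $T_n = \sqrt{\frac{\pi}{2}}\,\frac{1}{n}\sum_{i=1}^{n}|X_i - \mu|$
(see Problem 2.15). Using the Central Limit Theorem and the solution to
Problem 2.15, we find that $\mathcal{L}(T_n) \sim \mathcal{N}\left(\theta, \frac{\pi - 2}{2n}\theta^2\right)$ as $n \to \infty$. The asymptotic
efficiency is

$$\mathrm{eff}(T_n; \theta) = \frac{\theta^2/(2n)}{(\pi - 2)\theta^2/(2n)} = \frac{1}{\pi - 2} = 0.88\ldots$$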
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2.89, The likelihood function is 

i- j 

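The likelihood equation is

$$\theta^2 + 2\theta - T_n = 0, \qquad T_n = \frac{1}{n}\sum_{i=1}^{n}x_i^2.$$

This equation has the only positive solution $\hat\theta_n = \sqrt{1 + T_n} - 1$, which
maximizes $L(x; \theta)$. By the law of large numbers $T_n$ converges in probability to
$\mathbf{E}_\theta X_1^2 = \mathbf{D}_\theta X_1 + (\mathbf{E}_\theta X_1)^2 = 2\theta + \theta^2$ as $n \to \infty$. Consequently,

$$\hat\theta_n \xrightarrow{\;P\;} \sqrt{1 + 2\theta + \theta^2} - 1 = \theta.$$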
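A small numerical illustration of Problem 2.89 follows. The likelihood itself is not legible in the text, so the sketch assumes the model $\mathcal{N}(\theta, 2\theta)$, which is the one consistent with the moment identity $\mathbf{E}_\theta X_1^2 = 2\theta + \theta^2$ used above; the data values are made up for the example.

```python
import math


def mle_theta(xs):
    """Positive root of theta^2 + 2*theta - T_n = 0 with T_n = mean of x_i^2."""
    t_n = sum(x * x for x in xs) / len(xs)
    return math.sqrt(1.0 + t_n) - 1.0


def log_likelihood(theta, xs):
    """Log-likelihood under the ASSUMED model N(theta, 2*theta), theta > 0."""
    n = len(xs)
    return (-0.5 * n * math.log(4.0 * math.pi * theta)
            - sum((x - theta) ** 2 for x in xs) / (4.0 * theta))


xs = [0.3, 2.9, 1.4, 3.8, 0.9, 2.2, 1.7, 4.1]  # illustrative data only
theta_hat = mle_theta(xs)

if __name__ == "__main__":
    print(theta_hat, log_likelihood(theta_hat, xs))
```

Evaluating `log_likelihood` slightly to the left and right of `theta_hat` gives smaller values, in line with the statement that the positive root maximizes the likelihood.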
2.90. The distribution density of the observations is 

2» I. 2V(l- e >) -»(1 - 8 ») 2 ml<M ' 0,, j- 

Taking into account the hint, we may write 

Ax. r. q> = — e*p ! qi(jc* + ^) + <7uy> + r<q){, 

Zlr 

where r<q) m - ln (4^J - q\}. The likelihood equations from which we find <Ji 
and 4i have the form 

£$> + ;*>——-- <!** 

- X j *'>■( = — s— — r« 

" 7T7 ** "«' ~ «f 

Since <7 = — — — -, e = p the equations directly give the required 

*?f -«3 2ffi 

maximum likelihood estimates

2.91. The distribution density in our model is

$$f(x, y; \theta) = \frac{1}{2\pi(1 - \theta^2)^{1/2}}\exp\left\{-\frac{x^2 - 2\theta xy + y^2}{2(1 - \theta^2)}\right\};$$

therefore,

$$\ln L = -n\ln(2\pi) - \frac{n}{2}\ln(1 - \theta^2) - \frac{n}{2(1 - \theta^2)}(T_{11} - 2\theta T_{12} + T_{22}),$$

where

$$T_{11} = \frac{1}{n}\sum_{i=1}^{n}x_i^2, \quad T_{12} = \frac{1}{n}\sum_{i=1}^{n}x_iy_i, \quad T_{22} = \frac{1}{n}\sum_{i=1}^{n}y_i^2.$$

We thus find

$$\frac{\partial\ln L}{\partial\theta} = \frac{n\theta}{1 - \theta^2} - \frac{n\theta}{(1 - \theta^2)^2}(T_{11} - 2\theta T_{12} + T_{22}) + \frac{n}{1 - \theta^2}T_{12}.$$

The likelihood equation is reduced to the form

$$\theta(1 - \theta^2) + (1 + \theta^2)T_{12} - \theta(T_{11} + T_{22}) = 0.$$

This cubic equation has three roots, two of which may be complex. If all the
three roots are real and belong to the interval $(-1, 1)$, then the one which
maximizes the likelihood function $L$ is chosen as $\hat\theta_n$. We use the substitution
$\theta = x + \frac{1}{3}T_{12}$ to reduce the likelihood equation to the canonical form

$$x^3 + 3p_nx + 2q_n = 0, \qquad 3p_n = T_{11} + T_{22} - \frac{1}{3}T_{12}^2 - 1,$$
$$2q_n = \frac{T_{12}}{3}\left(T_{11} + T_{22} - \frac{2}{9}T_{12}^2 - 4\right).$$

If the inequality $p_n^3 + q_n^2 > 0$ holds, the equation has one real root (see the
solution to Problem 2.19), which is obvious for $p_n > 0$. Since the sampling
moments converge in probability as $n \to \infty$ to the respective theoretical
moments (see the solution to Problem 1.38), $p_n$ converges in probability to the
quantity $\frac{1}{3}\left(1 + 1 - \frac{1}{3}\theta^2 - 1\right) = \frac{1}{3}\left(1 - \frac{1}{3}\theta^2\right) > 0$. We see that for large
$n$ the likelihood equation has one real root, which is the maximum likelihood
estimate $\hat\theta_n$.

The asymptotic variance of $\hat\theta_n$ is $1/i_n(\theta)$, where

$$i_n(\theta) = -\mathbf{E}_\theta\frac{\partial^2\ln L}{\partial\theta^2}
= -n\,\mathbf{E}_\theta\left[\frac{1 + \theta^2}{(1 - \theta^2)^2} + \frac{4\theta}{(1 - \theta^2)^2}T_{12} - \frac{1 + 3\theta^2}{(1 - \theta^2)^3}(T_{11} - 2\theta T_{12} + T_{22})\right]$$
$$= -n\left[\frac{1 + \theta^2}{(1 - \theta^2)^2} + \frac{4\theta^2}{(1 - \theta^2)^2} - \frac{2(1 + 3\theta^2)}{(1 - \theta^2)^2}\right] = \frac{n(1 + \theta^2)}{(1 - \theta^2)^2}.$$

Thus,

$$\mathcal{L}(\sqrt{n}(\hat\theta_n - \theta)) \to \mathcal{N}\left(0, \frac{(1 - \theta^2)^2}{1 + \theta^2}\right)$$

as $n \to \infty$.
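The cubic likelihood equation of Problem 2.91 has no convenient closed form, so in practice it is solved numerically. The sketch below assumes the bivariate normal model with zero means, unit variances and correlation $\theta$; it brackets sign changes of the cubic on a grid in $(-1, 1)$, refines each bracket by bisection, and keeps the root with the largest log-likelihood. The grid size, helper names and data are assumptions for illustration only.

```python
import math


def sample_moments(pairs):
    """T11, T12, T22: mean of x^2, of x*y and of y^2."""
    n = len(pairs)
    t11 = sum(x * x for x, _ in pairs) / n
    t12 = sum(x * y for x, y in pairs) / n
    t22 = sum(y * y for _, y in pairs) / n
    return t11, t12, t22


def score_cubic(theta, t11, t12, t22):
    """Left-hand side of the likelihood equation."""
    return theta * (1 - theta**2) + (1 + theta**2) * t12 - theta * (t11 + t22)


def log_lik(theta, t11, t12, t22):
    """Log-likelihood per observation, up to an additive constant."""
    return (-0.5 * math.log(1 - theta**2)
            - (t11 - 2 * theta * t12 + t22) / (2 * (1 - theta**2)))


def mle_correlation(pairs, grid=2000):
    t11, t12, t22 = sample_moments(pairs)
    f = lambda th: score_cubic(th, t11, t12, t22)
    pts = [-1 + 2 * (i + 0.5) / grid for i in range(grid)]
    roots = []
    for a, b in zip(pts, pts[1:]):
        if f(a) * f(b) > 0:
            continue  # no sign change in this cell
        for _ in range(80):  # bisection
            mid = 0.5 * (a + b)
            if f(a) * f(mid) <= 0:
                b = mid
            else:
                a = mid
        roots.append(0.5 * (a + b))
    return max(roots, key=lambda th: log_lik(th, t11, t12, t22))


PAIRS = [(0.5, 0.8), (-1.2, -0.7), (0.3, -0.1), (1.5, 1.1),
         (-0.4, -0.9), (0.9, 0.2), (-1.1, -1.4), (0.2, 0.6)]
theta_hat = mle_correlation(PAIRS)

if __name__ == "__main__":
    print(theta_hat)
```

Bisection is used instead of Cardano's formula because it needs no case analysis on the discriminant and automatically restricts attention to the admissible interval.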

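2.92. By the Central Limit Theorem the statistic $T_n$ is asymptotically normal
with centre at $\theta$ and the asymptotic variance $\frac{1}{n}\mathbf{D}_\theta(X_1Y_1) = \frac{1}{n}[\mathbf{E}_\theta(X_1^2Y_1^2) - \theta^2]$.
Therefore, the problem is reduced to calculating the mixed moment $\mathbf{E}_\theta(X_1^2Y_1^2)$.
Here the characteristic function is

$$\varphi(t_1, t_2) = \mathbf{E}_\theta e^{i(t_1X_1 + t_2Y_1)} = \exp\left\{-\frac{1}{2}(t_1^2 + 2\theta t_1t_2 + t_2^2)\right\}.$$

Thus, $\mathbf{D}_\theta(X_1Y_1) = 1 + \theta^2$, and the asymptotic efficiency of $T_n$ is
$\left(\frac{1 - \theta^2}{1 + \theta^2}\right)^2$.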
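2.93. (1) Consider the likelihood function

$$L(x; \theta) = \prod_{i=1}^{n}f(x_i; \theta) = (2\pi)^{-nk/2}|\Sigma|^{-n/2}\exp\left\{-\frac{1}{2}\sum_{i=1}^{n}(x_i - \mu)'\Sigma^{-1}(x_i - \mu)\right\}.$$

Here

$$\sum_{i=1}^{n}(x_i - \mu)'\Sigma^{-1}(x_i - \mu) = \sum_{i=1}^{n}(x_i - \bar x)'\Sigma^{-1}(x_i - \bar x) + n(\bar x - \mu)'\Sigma^{-1}(\bar x - \mu).$$

Easily verifying the equation $y'By = \mathrm{tr}(BV)$, where $V = yy'$, and the linearity
of the operator tr, we obtain

$$\sum_{i=1}^{n}(x_i - \bar x)'\Sigma^{-1}(x_i - \bar x) = n\,\mathrm{tr}(\Sigma^{-1}\hat\Sigma(x)).$$

These relations give the formula from the hint, which implies that by maximizing
the function $L(x; \theta)$ in $\theta$, we minimize the function

$$\psi(x; \mu, \Sigma) = (\bar x - \mu)'\Sigma^{-1}(\bar x - \mu) + [\mathrm{tr}(\Sigma^{-1}\hat\Sigma(x)) - k - \ln|\Sigma^{-1}\hat\Sigma(x)|]$$

in $\mu$ and $\Sigma$. We use $\lambda_1, \ldots, \lambda_k$ to denote the roots of the characteristic equation
$|\Sigma^{-1}\hat\Sigma(x) - \lambda E_k| = 0$ or the equivalent equation $|\hat\Sigma(x) - \lambda\Sigma| = 0$. The
expression in the brackets is equal to $\lambda_1 + \ldots + \lambda_k - k - \ln(\lambda_1\ldots\lambda_k)$, and we may
write

$$\psi(x; \mu, \Sigma) = (\bar x - \mu)'\Sigma^{-1}(\bar x - \mu) + \sum_{i=1}^{k}(\lambda_i - 1 - \ln\lambda_i).$$

Since the matrix $\Sigma^{-1}$ is positive definite and $\lambda - 1 - \ln\lambda \ge 0$ $\forall\lambda > 0$, the
latter relation gives $\psi(x; \mu, \Sigma) \ge 0$ with the equality holding only for $\mu = \bar x$
and $\lambda_1 = \ldots = \lambda_k = 1$, i.e., when $\Sigma = \hat\Sigma(x)$.

(2) According to Problem 2.4, the statistic $\frac{n}{n-1}\hat\sigma_{ij}$ is an unbiased estimator
for $\sigma_{ij}$. Consequently, $\mathbf{E}\left(\frac{n}{n-1}\hat\Sigma\right) = \Sigma$.

(3) To find the required expression for $\max_\theta L(x; \theta)$, we must take into
account that $\mathrm{tr}(\hat\Sigma^{-1}(x)\hat\Sigma(x)) = \mathrm{tr}\,E_k = k$.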

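2.94. We calculate the moments $\mathbf{E}_\theta X_1^k$ in the following way. Let
$\varphi(t) = \mathbf{E}_\theta e^{itY_1} = \exp\left\{it\theta_1 - \frac{t^2}{2}\theta_2^2\right\}$. Then

$$\mathbf{E}_\theta X_1^k = \mathbf{E}_\theta e^{kY_1} = \varphi(-ik) = \exp\left\{k\theta_1 + \frac{k^2}{2}\theta_2^2\right\}.$$

Whence

$$\tau_1(\theta) = \mathbf{E}_\theta X_1 = \exp\{\theta_1 + \theta_2^2/2\},$$
$$\tau_2(\theta) = \mathbf{D}_\theta X_1 = \mathbf{E}_\theta X_1^2 - (\mathbf{E}_\theta X_1)^2 = \tau_1^2(\theta)(e^{\theta_2^2} - 1).$$

Since $(\hat\theta_{1n}, \hat\theta_{2n}^2) = (\bar Y, S^2(Y))$, by the invariance property of maximum
likelihood estimates we have

$$\hat\tau_{1n} = \exp\{\bar Y + S^2(Y)/2\}, \qquad \hat\tau_{2n} = \hat\tau_{1n}^2(e^{S^2(Y)} - 1).$$

Since $\bar Y$ and $S^2(Y)$ are independent (see Problem 1.56), we have
$\mathbf{E}_\theta\hat\tau_{1n} = \mathbf{E}_\theta e^{\bar Y}\,\mathbf{E}_\theta e^{S^2(Y)/2}$. Here $\mathcal{L}(\bar Y) = \mathcal{N}(\theta_1, \theta_2^2/n)$ and, as before,
$\mathbf{E}_\theta e^{\bar Y} = \exp\{\theta_1 + \theta_2^2/(2n)\}$. We also have $\mathcal{L}(nS^2(Y)/\theta_2^2) = \chi^2(n - 1)$ (see, for
example, the solution to Problem 1.58), whence $\mathbf{E}_\theta e^{S^2(Y)/2} = \psi(\theta_2^2/(2in))$, where
$\psi(t) = \mathbf{E}e^{it\chi^2_{n-1}} = (1 - 2it)^{-(n-1)/2}$ (see the solution to Problem 1.39). From this
we find $\mathbf{E}_\theta e^{S^2(Y)/2} = \left(1 - \frac{\theta_2^2}{n}\right)^{-(n-1)/2}$ for $n > \theta_2^2$ and, finally,

$$\mathbf{E}_\theta\hat\tau_{1n} = \exp\left\{\theta_1 + \frac{\theta_2^2}{2n}\right\}\left(1 - \frac{\theta_2^2}{n}\right)^{-(n-1)/2} \to \tau_1(\theta)$$

as $n \to \infty$.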

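2.96. Here

$$\frac{\partial\ln L(x; \theta)}{\partial\theta} = \frac{n\bar x}{\theta} - n\frac{f'(\theta)}{f(\theta)} = \frac{n}{\theta}(\bar x - \mu(\theta))$$

with $\mu(\theta) = \theta f'(\theta)/f(\theta) = \sum_x x\,a(x)\,\theta^x/f(\theta) = \mathbf{E}_\theta X_1$. This means that the
likelihood equation has the indicated form. The asymptotic variance of the
estimate $\hat\theta_n$ is $(n\,i(\theta))^{-1}$, where the information function is

$$i(\theta) = -\mathbf{E}_\theta\frac{\partial^2\ln f(X_1; \theta)}{\partial\theta^2} = \frac{\mu'(\theta)}{\theta}.$$

Specifically, in the model $\overline{Bi}(r, \theta)$ the mean is $\mu(\theta) = r\theta/(1 - \theta)$, and $\hat\theta_n =
\bar X/(r + \bar X)$ is the solution to the equation $\mu(\theta) = \bar X$ (this coincides with the
result of Problem 2.84), while the information function is $i(\theta) = \mu'(\theta)/\theta =
\frac{r}{\theta(1 - \theta)^2}$ (the result of Problem 2.43).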

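2.97. In this case $f(\theta) = e^\theta - 1$, $\mu(\theta) = \theta/(1 - e^{-\theta})$, and the likelihood
equation $\theta = \bar x(1 - e^{-\theta})$ cannot be solved exactly. In order to approximately
calculate the m.l.e. $\hat\theta_n$, we use the accumulation method (the method of scoring).
Here (see the solution to Problem 2.96)

$$U(x; \theta) = \frac{\partial}{\partial\theta}\ln L(x; \theta) = \frac{n}{\theta}(\bar x - \mu(\theta)), \qquad
i(\theta) = \frac{1 - e^{-\theta} - \theta e^{-\theta}}{\theta(1 - e^{-\theta})^2},$$

and the sought-for equations have the form

$$\theta_{k+1} = \theta_k + U(x; \theta_k)/(n\,i(\theta_k)), \quad k = 0, 1, \ldots$$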
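The iteration of Problem 2.97 is easy to run. The sketch below assumes the zero-truncated Poisson model ($f(\theta) = e^\theta - 1$), for which the update $U/(n\,i)$ reduces to $(\bar x - \mu(\theta))/(\theta\,i(\theta))$; the starting point $\theta_0 = \bar x$, the tolerance and the iteration cap are choices made for this example, and the sample mean must exceed 1 for a positive solution to exist.

```python
import math


def mu(theta: float) -> float:
    """Mean of the zero-truncated Poisson law: theta / (1 - exp(-theta))."""
    return theta / (1.0 - math.exp(-theta))


def info(theta: float) -> float:
    """Information function i(theta) = mu'(theta) / theta."""
    e = math.exp(-theta)
    return (1.0 - e - theta * e) / (theta * (1.0 - e) ** 2)


def scoring_mle(xbar: float, tol: float = 1e-12, max_iter: int = 100) -> float:
    """Iterate theta <- theta + U(theta) / (n * i(theta)), starting from xbar."""
    if xbar <= 1.0:
        raise ValueError("the likelihood equation needs a sample mean above 1")
    theta = xbar
    for _ in range(max_iter):
        step = (xbar - mu(theta)) / (theta * info(theta))
        theta += step
        if abs(step) < tol:
            break
    return theta


if __name__ == "__main__":
    print(scoring_mle(2.0))
```

Because the score is linear in $\bar x$ here, the scoring step coincides with Newton's method applied to $\mu(\theta) = \bar x$, which explains the fast convergence.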

2.98. In order to write these equations (see the solution to Problem 2.97), we
should know the contribution function $U(\theta) = \frac{\partial\ln L}{\partial\theta}$ and the information
function $i_n(\theta) = -\mathbf{E}_\theta\frac{\partial U(\theta)}{\partial\theta}$. In this case
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■ ■I **>1 

— ^ — *« = - 2i ^ 

Ftf 
because E^f = /y>i<S) and 2j P'W " ' ¥S - 
i- i 
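2.99. Since $i(\theta) = 1/2$ (see Problem 2.43), the iteration consists of

$$\theta_{k+1} = \theta_k + \frac{2}{n}U(\theta_k), \quad k = 0, 1, \ldots, \qquad
U(\theta) = 2\sum_{i=1}^{n}\frac{x_i - \theta}{1 + (x_i - \theta)^2}.$$

We may use the sample median $T_n = X_{([n/2]+1)}$ as a first approximation
$\theta_0$, the sample median being a consistent estimator for $\theta$ because here the
theoretical median coincides with $\theta$. From Problem 1.32 we find that as $n \to \infty$

$$\mathcal{L}(T_n) \sim \mathcal{N}\left(\theta, \frac{1}{4nf^2(\theta; \theta)}\right) = \mathcal{N}\left(\theta, \frac{\pi^2}{4n}\right).$$

Consequently,

$$\mathrm{eff}(T_n; \theta) = \frac{8}{\pi^2} = 0.81.$$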


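2.100. We write the likelihood function in the form $L(x; \theta) = I(\theta \ge x_{(n)})/\theta^n$
(see the solution to Problem 2.78) and see that it is monotone decreasing in
$\theta$ for $\theta \ge x_{(n)}$, i.e., it attains maximum at $\theta = x_{(n)}$. Then $\hat\theta_n = X_{(n)}$. Using the
solution to Problem 2.24, we find $\mathbf{E}_\theta\hat\theta_n = \frac{n}{n+1}\theta = \theta - \frac{\theta}{n+1}$, i.e., the
estimate $\hat\theta_n$ is asymptotically unbiased with $\mathbf{D}_\theta\hat\theta_n = \frac{n\theta^2}{(n+1)^2(n+2)} \to 0$ as
$n \to \infty$ and, consequently, $\hat\theta_n$ is consistent. The distribution function for $X_{(n)}$
was found in Problem 2.79. For $t \ge 0$ we have

$$\mathbf{P}_\theta\left(\frac{n(\theta - \hat\theta_n)}{\theta} \le t\right) = \mathbf{P}_\theta\left(X_{(n)} \ge \theta\left(1 - \frac{t}{n}\right)\right) = 1 - \left(1 - \frac{t}{n}\right)^n \to 1 - e^{-t},$$

i.e., the distribution of $\hat\theta_n$ is not asymptotically normal (the model is not
regular).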
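The non-normal limit in Problem 2.100 can be made concrete. Assuming the model $R(0, \theta)$, the scaled gap $n(\theta - X_{(n)})/\theta$ has the exact distribution function $1 - (1 - t/n)^n$ on $0 \le t \le n$, and the sketch below compares it with the exponential limit $1 - e^{-t}$; the function name is hypothetical.

```python
import math


def cdf_scaled_gap(t: float, n: int) -> float:
    """P(n (theta - X_(n)) / theta <= t) for a sample of size n from R(0, theta)."""
    if t <= 0:
        return 0.0
    if t >= n:
        return 1.0
    return 1.0 - (1.0 - t / n) ** n


if __name__ == "__main__":
    for n in (5, 50, 500, 5000):
        print(n, cdf_scaled_gap(1.5, n), 1.0 - math.exp(-1.5))
```

The printed pairs approach each other as $n$ grows, with a gap of order $1/n$, so the error of the m.l.e. shrinks at rate $1/n$ rather than the $1/\sqrt{n}$ rate of regular models.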

2.101. The form of the likelihood function $L(x; \theta) = I(\theta + 1/2 \ge x_{(n)}) \times
I(x_{(1)} \ge \theta - 1/2)$ implies that $L(x; \theta) = 1$ $\forall\theta \in [x_{(n)} - 1/2, x_{(1)} + 1/2]$.
Consequently, any $\theta$ from this interval maximizes $L(x; \theta)$. An arbitrary point
$T \in [X_{(n)} - 1/2, X_{(1)} + 1/2]$ can be written in the form $T = \alpha(X_{(n)} - 1/2) +
(1 - \alpha)(X_{(1)} + 1/2)$, $\alpha \in [0, 1]$. From this we find the condition

$$\mathbf{E}_\theta T = \frac{1}{2} - \alpha + \alpha\mathbf{E}_\theta X_{(n)} + (1 - \alpha)\mathbf{E}_\theta X_{(1)} = \theta \quad \forall\theta$$

from which we obtain an unbiased estimator for $\theta$. Using the solution to
Problem 1.36, we get $\alpha = 1/2$, i.e., $T = (1/2)(X_{(1)} + X_{(n)})$ is the midpoint of
the interval.

2.102. Here the density is $f(x; \theta) = b^{-a}a(x - \theta)^{a-1}\exp\{-b^{-a}(x - \theta)^a\}$,
$x > \theta$, and the likelihood function has the form

$$L(x; \theta) = \prod_{i=1}^{n}f(x_i; \theta)
= b^{-na}a^n I(x_{(1)} \ge \theta)\prod_{i=1}^{n}(x_i - \theta)^{a-1}\exp\left\{-b^{-a}\sum_{i=1}^{n}(x_i - \theta)^a\right\}.$$

It is monotone increasing on the interval $-\infty < \theta \le x_{(1)}$ and is zero for
$\theta > x_{(1)}$. Consequently, it has a maximum at the point $\theta = x_{(1)}$. The model
under investigation is not regular, but its m.l.e. $\hat\theta_n = X_{(1)}$ is, by Problem 2.26,
asymptotically unbiased and consistent. The asymptotic distribution of $X_{(1)}$
is given in the solution to Problem 1.37 and it is not normal.

2.103. Here the likelihood function is $L(x; \theta) = \left(\frac{2}{\theta}\right)^n e^{-T/\theta}\prod_{i=1}^{n}x_i$,
$T = \sum_{i=1}^{n}x_i^2$, and the likelihood equation $\frac{\partial\ln L}{\partial\theta} = 0$ has the only solution
$\hat\theta_n = T/n$ which maximizes $L(x; \theta)$. The model is a special case of Weibull's
distribution with an unknown scale parameter (see Problem 2.76). Therefore,
$\hat\theta_n$ coincides with the complete sufficient statistic and is an unbiased optimum
estimate for $\theta$.

2.104. The form of f„ follows from the solution to Problem 2.84 and the 
in variance of maximum likelihood estimates. From Problem 2.21 we get 

t[, where r, = — is an unbiased estimator for 9 ~ '. We i hus 



have 



&♦„-_*=-«-'.= —- ' 



Xn - 1 (X/i — 1>0 

i.c., f„ is an asymptotically unbiased estimate. 
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Further, 



(*")- .-a e -1 _ 1>*(S))* 



(Xfl - 1) ! (>J7 - 2) X* m(fl) 

(see Problem 2/43), We see that f n is consistent and 






2.105. In order to maximize the likelihood function, we must minimize the
sum $\sum_{i=1}^{n}|x_i - \theta| = \sum_{i=1}^{n}|x_{(i)} - \theta|$, whence follows that $\hat\theta_n$ coincides with the
sample median. We cannot use here the theorems on the asymptotic normality
of m.l.e.'s or sample quantiles because the density $f(x; \theta)$ is not differentiable
at the point $\theta$. Nevertheless, we have $\mathcal{L}(\hat\theta_n) \sim \mathcal{N}(\theta, 1/n)$ as was the case in
Problem 1.32.

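2.106. We seek the limiting distribution for the estimator $T_n$ as $n \to \infty$.
We have

$$\mathbf{P}_\theta(\sqrt{n}(T_n - \theta) \le x) = p_n\mathbf{P}_\theta(\sqrt{n}(\bar X - \theta) \le x) + (1 - p_n)\mathbf{P}_\theta(\sqrt{n}(b\bar X - \theta) \le x),$$

where

$$p_n = \mathbf{P}_\theta(|\bar X| \ge a_n) = \Phi(\sqrt{n}(\theta - a_n)) + \Phi(-\sqrt{n}(\theta + a_n)) \to
\begin{cases} 1 & \text{for } |\theta| > 0, \\ 0 & \text{for } \theta = 0 \end{cases}$$

as $n \to \infty$. Thus,

$$\mathcal{L}(\sqrt{n}(T_n - \theta)) \to \begin{cases} \mathcal{N}(0, 1) & \text{for } |\theta| > 0, \\ \mathcal{N}(0, b^2) & \text{for } \theta = 0 \end{cases}$$

as $n \to \infty$. From this we find

$$\mathrm{eff}(T_n; \theta) = \begin{cases} 1 & \text{for } |\theta| > 0, \\ 1/b^2 & \text{for } \theta = 0, \end{cases}$$

i.e., $\mathrm{eff}(T_n; \theta) \ge 1$ for $|b| < 1$, the strict inequality holding at the point $\theta = 0$,
which is, consequently, the point of superefficiency.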

2.107. We know that in the model $R(0, \theta)$ (see the solution to Problem 2.100)
$\hat\theta_n = X_{(n)}$ and $\mathbf{D}_\theta\hat\theta_n = O(n^{-2})$. In Weibull's model (see the solutions to
Problems 2.102 and 1.37) $\hat\theta_n = X_{(1)}$ and $\mathbf{D}_\theta\hat\theta_n = O(n^{-2/a})$ for $0 < a \le 1$, i.e.,
the variance of the m.l.e.'s may be as small as possible depending on the value
of $a$.

2.108. Here (see the solution to Problem 2.84) $\hat\theta_n = \bar X$ and by virtue of
the invariance we have $\hat\tau_n = \bar X^{-1}$. But we know (see Problem 1.39) that
$\mathcal{L}(n\bar X) = \Pi(n\theta)$ and $\mathbf{P}_\theta(\bar X = 0) = e^{-n\theta} > 0$. Consequently, the random
variable $\bar X$ becomes zero with a positive probability for any $n$, and the statistic $\hat\tau_n$
has no finite moments. At the same time its asymptotic variance is
$[\tau'(\theta)]^2/[n\,i(\theta)] = (\theta^3n)^{-1}$ (see Problem 2.44).

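2.109. The asymptotic variance of the m.l.e. $\hat\tau_n$ is independent of the
parameter $\theta$ if and only if the function $\tau(\theta)$ satisfies the relation
$\sigma_\tau^2(\theta) = (\tau'(\theta))^2/i(\theta) \equiv \mathrm{const}$. From this and Problem 2.43 we find the
respective equation $\tau'(\theta) = c/\sqrt{\theta(1 - \theta)}$ for the model $Bi(k, \theta)$. Its solution is
$\tau(\theta) = \arcsin\sqrt{\theta}$ (up to a constant factor), for which $\sigma_\tau^2(\theta) = 1/(4k)$. For the
model $\Pi(\theta)$ the sought-for function is the solution of the equation $\tau'(\theta) =
c/\sqrt{\theta}$, i.e., $\tau(\theta) = \sqrt{\theta}$ and $\sigma_\tau^2(\theta) = 1/4$. For the models $\mathcal{N}(\mu, \theta^2)$ and $\Gamma(\theta, \lambda)$ we
have the equation $\tau'(\theta) = c/\theta$, i.e., $\tau(\theta) = \ln\theta$. Thus, using Problem 2.84, we
have the following useful approximations:

for the model $Bi(k, \theta)$: $\mathcal{L}(\arcsin\sqrt{\hat\theta_n}) \sim \mathcal{N}\left(\arcsin\sqrt{\theta}, \frac{1}{4kn}\right)$, $\hat\theta_n = \bar X/k$;

for the model $\Pi(\theta)$: $\mathcal{L}(\sqrt{\hat\theta_n}) \sim \mathcal{N}\left(\sqrt{\theta}, \frac{1}{4n}\right)$, $\hat\theta_n = \bar X$;

for the model $\mathcal{N}(\mu, \theta^2)$: $\mathcal{L}(\ln\hat\theta_n) \sim \mathcal{N}\left(\ln\theta, \frac{1}{2n}\right)$, $\hat\theta_n = \left(\frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2\right)^{1/2}$;

for the model $\Gamma(\theta, \lambda)$: $\mathcal{L}(\ln\hat\theta_n) \sim \mathcal{N}\left(\ln\theta, \frac{1}{\lambda n}\right)$, $\hat\theta_n = \bar X/\lambda$.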
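The variance-stabilizing effect described in Problem 2.109 can be checked for the Poisson model without simulation. Since $n\bar X \sim \Pi(n\theta)$, the quantity $n\,\mathbf{D}_\theta\sqrt{\bar X}$ equals the variance of $\sqrt{K}$ for $K \sim \Pi(n\theta)$, which the sketch below computes by summing the probability mass function; the truncation rule for the sum is an assumption of this example.

```python
import math


def var_sqrt_poisson(mean: float) -> float:
    """Variance of sqrt(K) for K ~ Poisson(mean), by direct summation of the pmf."""
    t_max = int(mean + 20.0 * math.sqrt(mean) + 50)
    m1 = 0.0  # accumulates E sqrt(K)
    m2 = 0.0  # accumulates E K
    for t in range(t_max + 1):
        p = math.exp(-mean + t * math.log(mean) - math.lgamma(t + 1))
        m1 += math.sqrt(t) * p
        m2 += t * p
    return m2 - m1 * m1


if __name__ == "__main__":
    n = 25
    for theta in (2.0, 8.0, 20.0):
        # n * D(sqrt(Xbar)) should be close to 1/4 whatever theta is.
        print(theta, var_sqrt_poisson(n * theta))
```

The printed values stay near 0.25 across quite different values of $\theta$, whereas $n\,\mathbf{D}_\theta\bar X = \theta$ itself varies by a factor of ten over the same range.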

2.111. From the solution to Problem 2.83 we find that for the given $\eta = k$ (the number of elements observed in the sample) by maximizing the likelihood function in $N \ge k$, we maximize the functions $g(k; N) = C_N^k N^{-n}$ in $N$ or, which is equivalent, the functions
$$f(N) = \sum_{i=0}^{k-1}\ln(N - i) - n\ln N, \quad N \ge k.$$

Consider the difference
$$\Delta f(N) = f(N + 1) - f(N) = \ln\frac{N+1}{N-k+1} - n\ln\frac{N+1}{N} = [S(N, k) - n]\ln\frac{N+1}{N}, \quad S(N, k) = \ln\frac{N+1}{N-k+1}\Big/\ln\frac{N+1}{N}.$$

Here the function $S(N, k)$ is monotone decreasing in $N$ for $k > 1$. Indeed, if we introduce the function
$$\varphi(x) = \ln\left(1 - \frac{x}{N+2}\right)\Big/\ln\left(1 - \frac{x}{N+1}\right), \quad 0 < x < N + 1,$$
the inequality $S(N, k) > S(N + 1, k)$ will be equivalent to $\varphi(k) < \varphi(1)$. The function $\varphi(x)$ is monotone decreasing because the inequality $\varphi'(x) < 0$ reduces to
$$(N + 1 - x)\ln\left(1 - \frac{x}{N+1}\right) > (N + 2 - x)\ln\left(1 - \frac{x}{N+2}\right).$$
This follows from the fact that the function $\psi(y) = (y - x)\ln\left(1 - \dfrac xy\right)$, $y > x$, is monotone decreasing ($\psi'(y) = \ln\left(1 - \dfrac xy\right) + \dfrac xy < 0$). Thus, for the given values of $n$ and $k > 1$ the inequalities $S(N, k) \le n < S(N - 1, k)$ uniquely define the integer $N_0 = N_0(k, n)$. Here $\Delta f(N) > 0$ for $N \le N_0 - 1$ and $\Delta f(N_0) \le 0$, $\Delta f(N) < 0$ for $N \ge N_0 + 1$. This means that the function $f(N)$ is monotone increasing for $N \le N_0$ and monotone decreasing for $N \ge N_0$ if $S(N_0, k) \ne n$. If $S(N_0, k) = n$, then $f(N_0) = f(N_0 + 1)$. To the left of $N_0$ and to the right of $N_0 + 1$ we have $f(N) < f(N_0)$. Thus, in any case $\max_N f(N) = f(N_0)$, q.e.d. Now let $k = 1$. Then $g(1; N) = 1/N^{n-1}$ and the maximum is attained at $N = 1$.

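We have $\hat N = \eta$ if and only if there holds the condition $S(\eta, \eta) \le n$ (since $S(\eta - 1, \eta) = \infty$ by the definition of the function $S$). This condition can be written as $\ln(\eta + 1) \le n\ln\dfrac{\eta+1}{\eta}$.

In the given asymptotic conditions we find an approximate solution for $\alpha = n/N$ from the relation
$$-\ln\left(1 - \frac{\eta}{N}\right) = \alpha, \quad \text{i.e.,} \quad \frac{\eta}{N} = 1 - e^{-\alpha}.$$
Then $\dfrac{\eta}{n} = a(\alpha)$, where $a(\alpha) = \dfrac{1 - e^{-\alpha}}{\alpha}$. We use $a^{-1}(t)$ to denote the function inverse to $a(\alpha)$ and obtain $\hat\alpha = a^{-1}(\eta/n)$.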

For an arbitrary value of $m$ we get the function $g_m(k; N) = C_N^k (C_N^m)^{-n}$ instead of the function $g(k; N)$. We maximize the former in $N$ and find that for $\eta > m$ the m.l.e. $\hat N_m$ is defined by the inequalities
$$S_m(\hat N_m, \eta) \le n < S_m(\hat N_m - 1, \eta),$$
where
$$S_m(N, k) = \ln\frac{N+1}{N+1-k}\Big/\ln\frac{N+1}{N+1-m}, \quad S_m(k - 1, k) = \infty.$$

If $\eta = m$, we have $\hat N_m = m$.
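
The characterization of the maximum through the function $S(N, k)$ can be checked numerically. The sketch below is only an illustration for the case $m = 1$: the function names and the values $n = 10$, $k = 6$ are hypothetical and not taken from the problem, and it assumes $1 < k < n$ so that a finite maximum exists.

```python
import math


def log_likelihood(N, n, k):
    """f(N) = sum_{i<k} ln(N - i) - n ln N, the N-dependent part of ln g(k; N)."""
    return sum(math.log(N - i) for i in range(k)) - n * math.log(N)


def S(N, k):
    """S(N, k) = ln((N+1)/(N+1-k)) / ln((N+1)/N); infinite for N = k - 1."""
    if N + 1 - k <= 0:
        return math.inf
    return math.log((N + 1) / (N + 1 - k)) / math.log((N + 1) / N)


def mle_by_inequalities(n, k):
    """Smallest N >= k with S(N, k) <= n (assumes 1 < k < n)."""
    N = k
    while S(N, k) > n:
        N += 1
    return N


def mle_by_search(n, k, n_max=10_000):
    """Direct maximization of f(N) over k <= N <= n_max."""
    return max(range(k, n_max + 1), key=lambda N: log_likelihood(N, n, k))


print(mle_by_inequalities(10, 6), mle_by_search(10, 6))
```

Both functions should return the same integer; the direct search is included only as an independent check of the inequalities $S(N_0, k) \le n < S(N_0 - 1, k)$.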

2.112. If $\eta = k$, then $\hat N$ is the value of $N$ which maximizes the likelihood function
$$P_N(\eta = k) = C_{m_1}^k C_{N-m_1}^{m_2-k}/C_N^{m_2} \equiv g(N).$$

Here
$$\frac{g(N)}{g(N-1)} = \frac{(N - m_1)(N - m_2)}{N(N - m_1 - m_2 + k)},$$
whence follows that the inequalities $g(N) \gtrless g(N - 1)$ are equivalent to $Nk \lessgtr m_1m_2$. We write $N_0 = [m_1m_2/k]$ and find that for $N \le N_0$ the function $g(N)$ is increasing, while for $N \ge N_0$ it is decreasing if the number $m_1m_2/k$ is not integer, i.e., in this case $N_0$ is the point of maximum. If $N_0 = m_1m_2/k$, the maximum is attained at two points, i.e., $N_0 - 1$ and $N_0$. Thus, the value of $\hat N$ coincides with $N_0$ in any case. We established in Problem 2.36 that (for $n = 2$, $m_1 = m_2 = m$) the only unbiased estimator for $\tau(N) = 1/N$ which is a linear function of $\eta$ has the form $\eta/m^2$.

The maximum likelihood estimate is
$$\hat\tau = \frac{1}{\hat N} = \frac{1}{[m^2/\eta]} = \frac{\eta}{m^2}\left(1 - \frac{\varepsilon\eta}{m^2}\right)^{-1},$$
where $\varepsilon = \dfrac{m^2}{\eta} - \left[\dfrac{m^2}{\eta}\right]$. Clearly, it is a biased estimate.
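
As an illustration of the rule $\hat N = [m_1m_2/k]$, the following sketch compares the closed form with a direct exact maximization of $g(N)$. The sample values $m_1 = 20$, $m_2 = 15$, $k = 7$ and the function names are hypothetical; they are chosen so that $m_1m_2/k$ is not an integer.

```python
from fractions import Fraction
from math import comb


def likelihood(N, m1, m2, k):
    """g(N) = C(m1, k) C(N - m1, m2 - k) / C(N, m2), computed exactly."""
    return Fraction(comb(m1, k) * comb(N - m1, m2 - k), comb(N, m2))


def mle_closed_form(m1, m2, k):
    """Integer part of m1 * m2 / k."""
    return (m1 * m2) // k


def mle_by_search(m1, m2, k, n_max=500):
    """Direct maximization over all N consistent with the data, up to n_max."""
    start = m1 + m2 - k  # smallest population size compatible with the sample
    return max(range(start, n_max + 1), key=lambda N: likelihood(N, m1, m2, k))


print(mle_closed_form(20, 15, 7), mle_by_search(20, 15, 7))
```

When $m_1m_2/k$ is an integer the likelihood has two equal maxima, and a search of this kind returns the smaller of them, $N_0 - 1$.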

2.113. (1) We first prove that $d_n$ is a sufficient statistic with the help of the factorization test, i.e., we show that the representation $P_D(X = x) = g(d_n; D)h(x)$, $d_n = \sum_{i=1}^n x_i$, is valid for the likelihood function $P_D(X = x)$, $x = (x_1, \ldots, x_n)$, $x_i = 0, 1$, $i = 1, \ldots, n$. Using the formula for the product of probabilities, we may write
$$P_D(X = x) = P_D(X_1 = x_1)P_D(X_2 = x_2 \mid X_1 = x_1) \times \ldots \times P_D(X_n = x_n \mid X_1 = x_1, \ldots, X_{n-1} = x_{n-1}).$$

We then have
$$P_D(X_j = x_j \mid X_1 = x_1, \ldots, X_{j-1} = x_{j-1}) = \begin{cases} \dfrac{D - x_1 - \ldots - x_{j-1}}{N - j + 1} & \text{for } x_j = 1, \\[2mm] 1 - \dfrac{D - x_1 - \ldots - x_{j-1}}{N - j + 1} & \text{for } x_j = 0 \end{cases}$$
$$= (D - x_1 - \ldots - x_{j-1})^{x_j}(N - D - j + 1 + x_1 + \ldots + x_{j-1})^{1-x_j}/(N - j + 1)$$
and
$$P_D(X = x) = D^{x_1}(D - x_1)^{x_2}\ldots(D - x_1 - \ldots - x_{n-1})^{x_n}(N - D)^{1-x_1}(N - D - (1 - x_1))^{1-x_2} \times \ldots \times (N - D - (n - 1 - x_1 - \ldots - x_{n-1}))^{1-x_n}/(N)_n.$$

Using induction on $n$, we write the formula
$$M^{\varepsilon_1}(M - \varepsilon_1)^{\varepsilon_2}\ldots(M - \varepsilon_1 - \ldots - \varepsilon_{n-1})^{\varepsilon_n} = M(M - 1)\ldots\Bigl(M - \sum\varepsilon_i + 1\Bigr) = M!\Big/\Bigl(M - \sum\varepsilon_i\Bigr)!,$$
where $\varepsilon_i = 0, 1$, which finally gives us
$$P_D(X = x) = \frac{D!\,(N - D)!\,(N - n)!}{(D - d_n)!\,(N - D - n + d_n)!\,N!} = C_{N-n}^{D-d_n}/C_N^D.$$
We show that $d_n$ is a complete statistic. We must then prove that if $E_D\varphi(d_n) = 0$ for $D = 0, 1, \ldots, N$, then $\varphi(k) = 0$, $k \in \{0, 1, \ldots, \min(D, n)\}$ for all possible $D$, i.e., for $k = 0, 1, \ldots, n$. Since $\mathcal{L}_D(d_n) = H(D, N, n)$, the previous condition has the form
$$\sum_{k=0}^{\min(D, n)} \varphi(k)C_D^kC_{N-D}^{n-k}/C_N^n = 0, \quad D = 0, 1, \ldots, N.$$

We put $D = 0$, $D = 1$, etc. to obtain $\varphi(0) = 0$, $\varphi(1) = 0$, etc.

Now let $\tau(D)$ be the given estimated function of the parameter $D$. Its optimum estimator is the solution of the unbiasedness equation
$$E_DT(d_n) = \tau(D), \quad D = 0, 1, \ldots, N. \qquad (*)$$

For any function $T$ the left-hand side is a polynomial in $D$ of degree $\le n$. If $\tau(D)$ is not a polynomial of degree $\le n$, then the equation (*) has no solution, and there are no unbiased estimators for such functions.

Finally, we assume that $\tau(D)$ is the polynomial from the statement of the problem. We check whether $\tau^*$ meets the condition (*). We have
$$E_D\tau^* = \sum_j a_j\frac{(N)_j}{(n)_j}E_D(d_n)_j.$$

Here
$$E_D(d_n)_j = \sum_k (k)_jC_D^kC_{N-D}^{n-k}/C_N^n = (D)_j(n)_j/(N)_j.$$

Therefore,
$$E_D\tau^* = \sum_j a_j(D)_j = \tau(D).$$

Consequently, $\tau^*$ is an optimum estimator for $\tau(D)$ as a function of a complete sufficient statistic.

(2) The function $\tau_1(D)$ is a polynomial of degree one. Therefore, its unbiased estimator always exists and has the form $\tau_1^* = Nd_n/n$.

The function $\tau_2(D) = (N - 1)D - (D)_2$ is a polynomial of degree two. Therefore, it only has an unbiased estimator for a sample of size $n \ge 2$. From the above we find that $\tau_2^* = d_n(n - d_n)(N)_2/(n)_2$.

(3) In order to find an m.l.e. for the parameter $D$, we must maximize the quantity $g(d_n; D) = C_{N-n}^{D-d_n}/C_N^D$. But
$$\frac{g(d_n; D + 1)}{g(d_n; D)} = \frac{(D + 1)(N - n + d_n - D)}{(D + 1 - d_n)(N - D)},$$
whence we obtain $g(d_n; D + 1) \gtrless g(d_n; D)$ for $D \lessgtr \dfrac{d_n}{n}(N + 1) - 1$. If the number $\dfrac{d_n}{n}(N + 1)$ is not integer, the maximum is attained at the point $D_0 = \left[\dfrac{d_n}{n}(N + 1)\right]$. Otherwise, the maximum is attained at two points, i.e., $D_0$ and $D_0 - 1$. Thus, $\hat D_n = D_0$ in any case.
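
The unbiasedness of $\tau_1^*$, $\tau_2^*$ and the form of the m.l.e. can be verified exactly for a small case. The sketch below is illustrative only: $N = 20$ and $n = 6$ are hypothetical values, the function names are introduced here, and the hypergeometric probabilities are computed with exact fractions.

```python
from fractions import Fraction
from math import comb


def pmf(k, D, N, n):
    """P(d_n = k) for a sample of n items drawn without replacement, D of N marked."""
    return Fraction(comb(D, k) * comb(N - D, n - k), comb(N, n))


def expectation(stat, D, N, n):
    return sum(stat(k) * pmf(k, D, N, n) for k in range(n + 1))


def is_unbiased(N, n):
    """Check E tau1* = D and E tau2* = (N - 1) D - D (D - 1) for every D."""
    def tau1(d):
        return Fraction(N * d, n)

    def tau2(d):
        return Fraction(d * (n - d) * N * (N - 1), n * (n - 1))

    return all(
        expectation(tau1, D, N, n) == D
        and expectation(tau2, D, N, n) == (N - 1) * D - D * (D - 1)
        for D in range(N + 1)
    )


def mle_by_search(d, N, n):
    """Maximize C(D, d) C(N - D, n - d), which is proportional to g(d; D)."""
    return max(range(d, N - (n - d) + 1), key=lambda D: comb(D, d) * comb(N - D, n - d))


def mle_closed_form(d, N, n):
    return (d * (N + 1)) // n


print(is_unbiased(20, 6), mle_by_search(1, 20, 6), mle_closed_form(1, 20, 6))
```

The values of $d_n$ used in the check are chosen so that $d_n(N + 1)/n$ is not an integer; in the integer case the search returns $D_0 - 1$, the smaller of the two maximum points.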

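2.114. The likelihood function of the $j$th sample has the form
$$L_j = (2\pi\theta_2^2)^{-n_j/2}\exp\left\{-\frac{n_j}{2\theta_2^2}\left[s_j^2 + (\bar x_j - \theta_{1j})^2\right]\right\}$$
(see the solution to Problem 2.86). Since the samples are independent, the likelihood function is
$$L = \prod_{j=1}^k L_j = (2\pi es^2)^{-n/2}\exp\left\{-\frac{1}{2\theta_2^2}\sum_{j=1}^k n_j(\bar x_j - \theta_{1j})^2 - \frac n2\left(\frac{s^2}{\theta_2^2} - \ln\frac{s^2}{\theta_2^2} - 1\right)\right\}$$
for all of them. Here $s^2 = \dfrac1n\sum_{j=1}^k n_js_j^2$, $n = n_1 + \ldots + n_k$. The power of the exponent is non-positive and becomes zero only for $\theta_{1j} = \bar x_j$, $j = 1, \ldots, k$, $\theta_2 = s$. Thus the m.l.e.'s of the parameters have the indicated form. We now have (see the solution to Problem 2.1)
$$Es^2 = \frac1n\sum_{j=1}^k n_jEs_j^2 = \frac1n\sum_{j=1}^k (n_j - 1)\theta_2^2 = \frac{n-k}{n}\theta_2^2.$$

Consequently, the statistic $\dfrac{n}{n-k}s^2$ is an unbiased estimator for $\theta_2^2$.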

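2.115. Taking into account the choice of $c_\gamma$, we have
$$\gamma = P_\theta(\sqrt n|\bar X - \theta|/\theta < c_\gamma) = P_\theta(\theta(1 - c_\gamma/\sqrt n) < \bar X < \theta(1 + c_\gamma/\sqrt n)) = P_\theta(\bar X/(1 + c_\gamma/\sqrt n) < \theta < \bar X/(1 - c_\gamma/\sqrt n)).$$

For the model with a negative parameter the original equation becomes $\gamma = P_\theta(\sqrt n|\bar X - \theta|/|\theta| < c_\gamma)$, whence the sought-for interval is $(\bar X/(1 - c_\gamma/\sqrt n),\ \bar X/(1 + c_\gamma/\sqrt n))$.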

2.116. (1) Here $\mathcal{L}_\theta(G(X; \theta)) = \mathcal{N}(0, 1)$ and $G(X; \theta)$ is a central statistic. We get
$$P_\theta(g_1 < G(X; \theta) < g_2) = \Phi(g_2) - \Phi(g_1) = \gamma,$$
and $T_1 = \bar X - \dfrac{\sigma}{\sqrt n}g_2$, $T_2 = \bar X - \dfrac{\sigma}{\sqrt n}g_1$ are the solutions to $G(X; \theta) = g_2, g_1$. Thus, the confidence interval $\Delta_\gamma(X)$ has the indicated form. Its length is $l = \dfrac{\sigma}{\sqrt n}(g_2 - g_1)$, and therefore in order to construct the shortest interval, we should minimize the difference $g_2 - g_1$ under the condition $\Phi(g_2) - \Phi(g_1) = \gamma$. We use Lagrange's method for finding the conditional extremum. We form the Lagrange function
$$H(g_1, g_2, \lambda) = g_2 - g_1 + \lambda(\Phi(g_2) - \Phi(g_1) - \gamma)$$
and equate all its partial derivatives to zero to obtain the system of equations $\Phi'(g_1) = \Phi'(g_2)$, $\Phi(g_2) - \Phi(g_1) = \gamma$. Since the function $\Phi'(x) = \dfrac{1}{\sqrt{2\pi}}e^{-x^2/2}$ is even, the first equation gives $g_1 = -g_2$. Taking into account the second equation and the relation $\Phi(-x) = 1 - \Phi(x)$, we obtain $2\Phi(g_2) - 1 = \gamma$, whence $g_2 = c_\gamma$.

(2) The length $l$ of the confidence interval $\Delta_\gamma(X)$, the confidence level $\gamma$, and the sample size $n$ are related here as $l = \dfrac{2\sigma c_\gamma}{\sqrt n}$. Then for the given $l$ and $\gamma$ the required number of observations is $n = n(l, \gamma) = \left[\dfrac{4\sigma^2c_\gamma^2}{l^2}\right]$, and for the given $n$ and $l$ the confidence level is $\gamma = \gamma(n, l) = 2\Phi\left(\dfrac{l\sqrt n}{2\sigma}\right) - 1$. Specifically, $c_{0.99} = 2.5758$ and $n = 106$ for $l = 0.5$ (at $\sigma = 1$), while $n = 2653$ for $l = 0.1$.
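
The relation $l = 2\sigma c_\gamma/\sqrt n$ is easy to evaluate numerically. The sketch below uses only the Python standard library; the function names are introduced here for illustration, and $\sigma = 1$ is assumed as in the numerical example.

```python
import math
from statistics import NormalDist


def c_gamma(gamma):
    """Solution of Phi(c) = (1 + gamma) / 2."""
    return NormalDist().inv_cdf((1 + gamma) / 2)


def sample_size_bound(length, gamma, sigma=1.0):
    """The real number 4 sigma^2 c_gamma^2 / l^2 obtained from l = 2 sigma c_gamma / sqrt(n)."""
    return 4 * sigma**2 * c_gamma(gamma) ** 2 / length**2


def confidence_level(n, length, sigma=1.0):
    """gamma(n, l) = 2 Phi(l sqrt(n) / (2 sigma)) - 1."""
    return 2 * NormalDist().cdf(length * math.sqrt(n) / (2 * sigma)) - 1


for l in (0.5, 0.1):
    bound = sample_size_bound(l, 0.99)
    print(l, math.floor(bound), math.ceil(bound))
```

The integer parts reproduce the sample sizes quoted above; rounding the bound up instead of down is the choice that guarantees a length not exceeding $l$.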

2.117. The distribution density of the random variable $T/\theta$, which in our case is a central statistic, is



HH -.*<SH -»*•* 



Therefore, 



Frfl I 6,<X» - P. ^ < I < »\ B 2 \ **&>) & - 7 . 

Tbe shortest of the intervals under consideration is found by minimizing the 
ratio «*/„, under the Momoa j" xkJ) ^ dx m y/2 The mefkoi ofLagrange 



multipliers gives 



•1 



J 
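We write $a_1 = \chi^2_{\alpha_1, n}$, $a_2 = \chi^2_{1-\alpha_2, n}$ and reduce the above equations to
$$\chi^2_{1-\alpha_2, n}/\chi^2_{\alpha_1, n} = \exp\left\{\frac1n\left(\chi^2_{1-\alpha_2, n} - \chi^2_{\alpha_1, n}\right)\right\}, \quad \alpha_1 + \alpha_2 = 1 - \gamma.$$

These relations uniquely define $\alpha_1$ and $\alpha_2$ and thus $\chi^2_{\alpha_1, n}$, $\chi^2_{1-\alpha_2, n}$ [7, Table 2.3]. Thus, the optimal interval has the form
$$\Delta_\gamma^*(X) = (T_1^*, T_2^*) = (T/\chi^2_{1-\alpha_2, n},\ T/\chi^2_{\alpha_1, n}).$$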

2.118. Here the central statistic is $G(X; \tau) = T/\tau$, $\tau = \theta^2$, and the assertion follows from the previous problem.

2.119. From the hint we have $P_\theta(\sqrt{n-1}(\bar X - \theta_1)/S \le t_{\gamma, n-1}) = \gamma$, whence we find the lower $\gamma$-confidence interval $(\bar X - t_{\gamma, n-1}S/\sqrt{n-1} < \theta_1)$. Similarly, the relation $P_\theta(\sqrt{n-1}(\bar X - \theta_1)/S \ge -t_{\gamma, n-1}) = \gamma$ gives the upper $\gamma$-confidence interval $(\theta_1 < \bar X + t_{\gamma, n-1}S/\sqrt{n-1})$. The central two-sided $\gamma$-confidence interval is $(\bar X \pm t_{(1+\gamma)/2, n-1}S/\sqrt{n-1})$. This is the shortest interval among all the $\gamma$-confidence intervals of the form $(\bar X - a_1S, \bar X + a_2S)$, which is proved similarly to the assertion from Problem 2.116.

2.120. Here $nS^2/\theta_2^2$ is a central statistic and
$$\left(nS^2/\chi^2_{(1+\gamma)/2, n-1},\ nS^2/\chi^2_{(1-\gamma)/2, n-1}\right)$$
is the central $\gamma$-confidence interval. The lower and upper $\gamma$-confidence intervals are $(nS^2/\chi^2_{\gamma, n-1} < \theta_2^2)$ and $(\theta_2^2 < nS^2/\chi^2_{1-\gamma, n-1})$, respectively.

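2.122. The sample means $\bar X$ and $\bar Y$ are independent and normal as $\mathcal{N}(\theta_1^{(1)}, \sigma_1^2/n)$ and $\mathcal{N}(\theta_1^{(2)}, \sigma_2^2/m)$, respectively. Therefore, $\mathcal{L}(\bar X - \bar Y) = \mathcal{N}(\tau, \sigma^2)$, $\sigma^2 = \sigma_1^2/n + \sigma_2^2/m$, i.e., $(\bar X - \bar Y - \tau)/\sigma$ is a central statistic for $\tau$. The $\gamma$-confidence interval is sought as in Problem 2.116 and has the form $(\bar X - \bar Y \pm c_\gamma\sqrt{\sigma_1^2/n + \sigma_2^2/m})$.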

2.123. We have $\mathcal{L}\left(\dfrac{n}{\theta_2^2}S^2(X)\right) = \chi^2(n - 1)$, $\mathcal{L}\left(\dfrac{m}{\theta_2^2}S^2(Y)\right) = \chi^2(m - 1)$. Since the samples are independent, we also have
$$\mathcal{L}\left(\frac{1}{\theta_2^2}[nS^2(X) + mS^2(Y)]\right) = \chi^2(m + n - 2).$$

Here the random variables $\bar X - \bar Y$ and $nS^2(X) + mS^2(Y)$ are independent (by Fisher's theorem). Then $\mathcal{L}(t_{m+n-2}) = S(m + n - 2)$, and, consequently, $t_{m+n-2}$ is a central statistic for $\tau$. The respective $\gamma$-confidence interval is constructed as in Problem 2.119 and has the form
$$\left(\bar X - \bar Y \pm t_{(1+\gamma)/2, m+n-2}\sqrt{\frac{(m + n)(nS^2(X) + mS^2(Y))}{mn(m + n - 2)}}\right).$$

2.125. By Fisher's theorem we have $\mathcal{L}(nS^2(X)/\theta_{21}^2) = \chi^2(n - 1)$, $\mathcal{L}(mS^2(Y)/\theta_{22}^2) = \chi^2(m - 1)$, and $S^2(X)$ and $S^2(Y)$ are independent. Consequently, $\mathcal{L}(F_{n-1, m-1}) = S(n - 1, m - 1)$. Then the central $\gamma$-confidence interval for $\tau$ has the form

VK" - 1) 5*00/ *"C" " s < Y V 7 

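2.126. Since $\mathcal{L}(2n\bar X/\theta_1) = \chi^2(2n)$, $\mathcal{L}(2m\bar Y/\theta_2) = \chi^2(2m)$, we have $\mathcal{L}(\tau\bar X/\bar Y) = S(2n, 2m)$. As in the previous problem, we find the sought-for interval
$$\left(F_{(1-\gamma)/2, 2n, 2m}\bar Y/\bar X,\ F_{(1+\gamma)/2, 2n, 2m}\bar Y/\bar X\right).$$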

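2.128. Since $P_\theta(X_{(1)} > x) = P_\theta^n(\xi > x) = e^{-n(x - \theta)}$ for $x \ge \theta$, we have $P_\theta(\theta \le X_{(1)} \le \theta + \delta) = 1 - e^{-n\delta} = \gamma$ for $\delta = -\dfrac1n\ln(1 - \gamma)$, i.e., the sought-for interval is
$$\left(X_{(1)} + \frac1n\ln(1 - \gamma),\ X_{(1)}\right).$$

2.129. Since $\mathcal{L}_\theta(X_i/\theta) = R(0, 1)$, we have $\mathcal{L}_\theta(X_{(n)}/\theta) = \beta(n, 1)$ (see Problem 1.35). Then for $t \le 1$ we have
$$P_\theta((X_{(n)}/\theta)^n \le t) = P_\theta(X_{(n)}/\theta \le t^{1/n}) = n\int_0^{t^{1/n}} x^{n-1}\,dx = t.$$

Whence
$$P_\theta\left(X_{(n)} \le \theta \le X_{(n)}/\sqrt[n]{1 - \gamma}\right) = P_\theta((X_{(n)}/\theta)^n > 1 - \gamma) = \gamma.$$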

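2.130. Since $\mathcal{L}_\theta(2n\bar X/\theta) = \chi^2(2n)$, we have
$$P_\theta\left(\chi^2_{(1-\gamma)/2, 2n} < \frac{2n\bar X}{\theta} < \chi^2_{(1+\gamma)/2, 2n}\right) = \gamma,$$
which is equivalent to our assertion.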

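2.131. We have
$$P_\theta((\theta_1, \theta_2) \in G_\gamma(X)) = P_\theta\left(\sqrt n|\bar X - \theta_1|\theta_2^{-1} < c_{\gamma_1},\ \chi^2_{(1-\gamma_2)/2, n-1} < nS^2/\theta_2^2 < \chi^2_{(1+\gamma_2)/2, n-1}\right)$$
$$= P_\theta\left(\sqrt n|\bar X - \theta_1|\theta_2^{-1} < c_{\gamma_1}\right)P_\theta\left(\chi^2_{(1-\gamma_2)/2, n-1} < nS^2/\theta_2^2 < \chi^2_{(1+\gamma_2)/2, n-1}\right)$$
$$= (\Phi(c_{\gamma_1}) - \Phi(-c_{\gamma_1}))\left(\frac{1 + \gamma_2}{2} - \frac{1 - \gamma_2}{2}\right) = \gamma_1\gamma_2 = \gamma.$$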

2.132. According to Problem 1.59 (b), the quadratic form
$$Q = \frac{n}{1 - \rho^2}\left[\frac{1}{\sigma_1^2}(\bar X_1 - \theta_1)^2 - \frac{2\rho}{\sigma_1\sigma_2}(\bar X_1 - \theta_1)(\bar X_2 - \theta_2) + \frac{1}{\sigma_2^2}(\bar X_2 - \theta_2)^2\right]$$
is distributed as $\chi^2(2)$ for any $\theta$ and $\Sigma$. Therefore,
$$\gamma = P_\theta(Q \le \chi^2_{\gamma, 2}) = P_\theta(\theta \in G_\gamma(X)),$$
where
$$G_\gamma(X) = \{\theta: Q = n(\bar X_1 - \theta_1, \bar X_2 - \theta_2)\Sigma^{-1}(\bar X_1 - \theta_1, \bar X_2 - \theta_2)' \le \chi^2_{\gamma, 2}\}.$$

Thus, $G_\gamma(X)$ is the sought-for $\gamma$-confidence region for $\theta = (\theta_1, \theta_2)$. This is the inside of an ellipse with centre at a random point $(\bar X_1, \bar X_2)$ whose boundary is defined by the equation $Q = \chi^2_{\gamma, 2}$.

2.133. Since $\mathcal{L}_\theta(nT) = Bi(n, \theta)$ (see Problem 1.39 (3)), the random variable $T$ assumes the values $0, 1/n, 2/n, \ldots, n/n$ and its distribution function
$$F\left(\frac kn; \theta\right) = \sum_{r=0}^k C_n^r\theta^r(1 - \theta)^{n-r}$$
is a continuous and monotone decreasing function in $\theta$ (for $k < n$), viz.,
$$\frac{\partial}{\partial\theta}F\left(\frac kn; \theta\right) = -nC_{n-1}^k\theta^k(1 - \theta)^{n-k-1} < 0, \quad k < n.$$

Consequently, the sought-for interval is defined by the solutions of the equations $F(T; \theta) = 1 - F(T - 0; \theta) = (1 - \gamma)/2$, which are just as given in the statement of the problem. The expression for the boundaries $T_1$, $T_2$ in terms of the quantiles of the beta distribution follows from
$$\sum_{r=k}^n C_n^rp^r(1 - p)^{n-r} = \frac{1}{B(k, n - k + 1)}\int_0^p x^{k-1}(1 - x)^{n-k}\,dx.$$

If the number $n$ of observations is large, then we may use the asymptotic theory of maximum likelihood estimates to find the approximate confidence interval for $\theta$ quickly. In this case the m.l.e. is $\hat\theta_n = \bar X$ (see Problem 2.84), and the information function is $i(\theta) = [\theta(1 - \theta)]^{-1}$ (see Problem 2.43). The sought-for interval has the form $(\bar X \pm c_\gamma\sqrt{\bar X(1 - \bar X)/n})$.
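
Because $F(k/n; \theta)$ is continuous and monotone decreasing in $\theta$, the equations $F(T; \theta) = 1 - F(T - 0; \theta) = (1 - \gamma)/2$ can be solved by bisection. The sketch below is one possible way to do this with the standard library only; the function names are hypothetical, and the boundary cases $k = 0$ and $k = n$ are handled by the conventions $T_1 = 0$ and $T_2 = 1$, which are an assumption of this sketch.

```python
from math import comb


def binom_cdf(k, n, theta):
    """F(k/n; theta) = P(nT <= k) for nT distributed as Bi(n, theta)."""
    if k < 0:
        return 0.0
    return sum(comb(n, r) * theta**r * (1 - theta) ** (n - r) for r in range(min(k, n) + 1))


def solve_decreasing(func, target, lo=0.0, hi=1.0, iters=100):
    """Bisection for func(theta) = target, where func decreases in theta."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if func(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_interval(k, n, gamma):
    """Confidence limits for theta given nT = k successes in n trials."""
    alpha = (1 - gamma) / 2
    # Upper limit: F(T; theta) = alpha.
    upper = 1.0 if k == n else solve_decreasing(lambda t: binom_cdf(k, n, t), alpha)
    # Lower limit: 1 - F(T - 0; theta) = alpha, i.e. F((k - 1)/n; theta) = 1 - alpha.
    lower = 0.0 if k == 0 else solve_decreasing(lambda t: binom_cdf(k - 1, n, t), 1 - alpha)
    return lower, upper


print(exact_interval(3, 20, 0.95))
```

For $k = 0$ the upper limit has the closed form $1 - ((1 - \gamma)/2)^{1/n}$, which gives a simple check of the bisection.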

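2.134. For large $n$ we have
$$\gamma \approx P_\theta\left(\sqrt n|\bar X - \theta|/\sqrt{\theta(1 - \theta)} < c_\gamma\right) = P_\theta\left((\bar X - \theta)^2 < \frac{c_\gamma^2}{n}\theta(1 - \theta)\right) = P_\theta((\theta - T_1)(\theta - T_2) < 0) = P_\theta(T_1 < \theta < T_2),$$
where
$$T_{1,2} = \left(\bar X + \frac{c_\gamma^2}{2n} \mp c_\gamma\sqrt{\frac{\bar X(1 - \bar X)}{n} + \frac{c_\gamma^2}{4n^2}}\right)\Big/\left(1 + \frac{c_\gamma^2}{n}\right).$$

Thus, $(T_1, T_2)$ is the sought-for approximate $\gamma$-confidence interval. If we neglect the terms of the order $1/n$, then this interval is reduced to the interval $(\bar X \pm c_\gamma\sqrt{\bar X(1 - \bar X)/n})$ obtained in the previous problem.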
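
A short numerical comparison of the two approximate intervals may be helpful. In the sketch below the function names are introduced for illustration; `score_interval` evaluates $T_{1,2}$ from the quadratic inequality $(\bar X - \theta)^2 < c_\gamma^2\theta(1 - \theta)/n$, and `wald_interval` evaluates $\bar X \pm c_\gamma\sqrt{\bar X(1 - \bar X)/n}$.

```python
import math
from statistics import NormalDist


def wald_interval(x_bar, n, gamma):
    """Interval x_bar +/- c sqrt(x_bar (1 - x_bar) / n)."""
    c = NormalDist().inv_cdf((1 + gamma) / 2)
    half = c * math.sqrt(x_bar * (1 - x_bar) / n)
    return x_bar - half, x_bar + half


def score_interval(x_bar, n, gamma):
    """Roots T1 < T2 of n (x_bar - theta)^2 = c^2 theta (1 - theta)."""
    c = NormalDist().inv_cdf((1 + gamma) / 2)
    centre = x_bar + c * c / (2 * n)
    half = c * math.sqrt(x_bar * (1 - x_bar) / n + c * c / (4 * n * n))
    scale = 1 + c * c / n
    return (centre - half) / scale, (centre + half) / scale


for n in (50, 10**6):
    print(n, wald_interval(0.3, n, 0.95), score_interval(0.3, n, 0.95))
```

For moderate $n$ the two intervals differ visibly, while for very large $n$ the difference is of order $1/n$, in agreement with the remark above.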

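2.135. Since $\mathcal{L}(2\sqrt n(\tau(\bar X) - \tau(\theta))) \approx \mathcal{N}(0, 1)$, for large samples we have
$$\gamma \approx P_\theta(2\sqrt n|\tau(\bar X) - \tau(\theta)| \le c_\gamma) = P_\theta\left(\tau(\bar X) - \frac{c_\gamma}{2\sqrt n} < \tau(\theta) < \tau(\bar X) + \frac{c_\gamma}{2\sqrt n}\right),$$
which is equivalent to our assertion.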

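2.137. The solution is similar to that of Problem 2.133. Here $\mathcal{L}_\theta(nT) = \Pi(n\theta)$, and $T$ assumes the values $k/n$, $k = 0, 1, 2, \ldots$. Its distribution function
$$F\left(\frac kn; \theta\right) = \sum_{r=0}^k e^{-n\theta}\frac{(n\theta)^r}{r!}$$
is continuous and monotone decreasing in $\theta$, i.e.,
$$\frac{\partial}{\partial\theta}F\left(\frac kn; \theta\right) = -ne^{-n\theta}\frac{(n\theta)^k}{k!} < 0.$$

Therefore, the sought-for interval is defined by the solutions of the equations $F(T; \theta) = 1 - F(T - 0; \theta) = (1 - \gamma)/2$ which have the indicated form. The expression for the boundaries $T_1$, $T_2$ in terms of the $\chi^2$-quantiles follows from
$$P(\chi^2_{2k} > 2\lambda) = \sum_{r=0}^{k-1} e^{-\lambda}\frac{\lambda^r}{r!},$$
where $\mathcal{L}(\chi^2_{2k}) = \chi^2(2k)$. This can easily be verified.

For large $n$ (by Problems 2.84 and 2.43) we have the approximation
$$\mathcal{L}\left(\sqrt{n/\hat\theta_n}\,(\hat\theta_n - \theta)\right) \approx \mathcal{N}(0, 1)$$
for $\hat\theta_n = \bar X$, whence we find the approximate interval.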
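
2.138. In the first case we have
$$\gamma \approx P_\theta(2\sqrt n|\sqrt{\bar X} - \sqrt\theta| < c_\gamma) = P_\theta\left(\max\left(0, \sqrt{\bar X} - \frac{c_\gamma}{2\sqrt n}\right) < \sqrt\theta < \sqrt{\bar X} + \frac{c_\gamma}{2\sqrt n}\right).$$

Whence follows that
$$\left(\left[\max\left(0, \sqrt{\bar X} - \frac{c_\gamma}{2\sqrt n}\right)\right]^2,\ \left(\sqrt{\bar X} + \frac{c_\gamma}{2\sqrt n}\right)^2\right)$$
is the sought-for approximate $\gamma$-confidence interval for $\theta$. Using another approximation, we obtain
$$\gamma \approx P_\theta(\sqrt n|\bar X - \theta|/\sqrt\theta < c_\gamma) = P_\theta((\bar X - \theta)^2 < c_\gamma^2\theta/n) = P_\theta\left(\theta^2 - 2\theta\left(\bar X + \frac{c_\gamma^2}{2n}\right) + \bar X^2 < 0\right) = P_\theta(T_1 < \theta < T_2),$$
where $T_{1,2} = T_{1,2}(\bar X) = \bar X + \dfrac{c_\gamma^2}{2n} \mp c_\gamma\sqrt{\dfrac{\bar X}{n} + \dfrac{c_\gamma^2}{4n^2}}$. Thus, $(T_1, T_2)$ is also a $\gamma$-confidence interval for $\theta$. These two intervals are equivalent to the interval $(\bar X \pm c_\gamma\sqrt{\bar X/n})$ of the previous problem up to the terms of order $1/n$.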
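
The three approximate intervals for the Poisson parameter can be compared directly. The following sketch is illustrative: the function name, the dictionary keys and the sample values $\bar X = 4$, $n = 400$ are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_intervals(x_bar, n, gamma):
    """Three approximate gamma-confidence intervals for the Poisson mean."""
    c = NormalDist().inv_cdf((1 + gamma) / 2)
    h = c / (2 * math.sqrt(n))
    # Interval based on the square-root transformation.
    root = (max(0.0, math.sqrt(x_bar) - h) ** 2, (math.sqrt(x_bar) + h) ** 2)
    # Roots of (x_bar - theta)^2 = c^2 theta / n.
    half = c * math.sqrt(x_bar / n + c * c / (4 * n * n))
    centre = x_bar + c * c / (2 * n)
    quadratic = (centre - half, centre + half)
    # Interval x_bar +/- c sqrt(x_bar / n).
    simple = (x_bar - c * math.sqrt(x_bar / n), x_bar + c * math.sqrt(x_bar / n))
    return {"sqrt": root, "quadratic": quadratic, "simple": simple}


for name, interval in poisson_intervals(4.0, 400, 0.95).items():
    print(name, interval)
```

With these values the endpoints agree to within a few thousandths, which is consistent with the statement that the intervals coincide up to terms of order $1/n$.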

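2.140. Using the result of Problem 2.96, we find the sought-for interval $\left(\hat\theta_n \pm c_\gamma\sqrt{\hat\theta_n/(n\mu'(\hat\theta_n))}\right)$, where $\hat\theta_n$ is the solution of the equation $\mu(\theta) = \bar X$, and $\mu(\theta)$ is the theoretical mean of the distribution. For the distribution $\overline{Bi}(r, \theta)$ the interval is
$$\left(\frac{\bar X}{r + \bar X} \pm c_\gamma\sqrt{\frac{r\bar X}{n(r + \bar X)^3}}\right).$$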

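2.141. The sought-for interval is $(\hat\theta_n \pm c_\gamma/\sqrt{n\,i(\hat\theta_n)})$, where $i(\theta) = \lambda/\theta^2$, $\hat\theta_n = \bar X/\lambda$. As a result, we find the interval $\dfrac{\bar X}{\lambda}(1 \pm c_\gamma/\sqrt{\lambda n})$.

If we use the approximation $\mathcal{L}(\sqrt{\lambda n}(\ln\hat\theta_n - \ln\theta)) \approx \mathcal{N}(0, 1)$ (see the solution to Problem 2.109), then we will have
$$\gamma \approx P_\theta(\sqrt{\lambda n}|\ln\hat\theta_n - \ln\theta| < c_\gamma) = P_\theta\left(\ln\hat\theta_n - \frac{c_\gamma}{\sqrt{\lambda n}} < \ln\theta < \ln\hat\theta_n + \frac{c_\gamma}{\sqrt{\lambda n}}\right),$$
i.e., $\left(\dfrac{\bar X}{\lambda}e^{-c_\gamma/\sqrt{\lambda n}},\ \dfrac{\bar X}{\lambda}e^{c_\gamma/\sqrt{\lambda n}}\right)$ is another approximation of the $\gamma$-confidence interval which coincides with the first one up to the terms of order $1/n$.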

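2.142. By Problem 2.109 we have
$$\gamma \approx P_\theta(\sqrt{2n}|\ln\hat\theta_n - \ln\theta| < c_\gamma) = P_\theta\left(\hat\theta_ne^{-c_\gamma/\sqrt{2n}} < \theta < \hat\theta_ne^{c_\gamma/\sqrt{2n}}\right),$$
where $\hat\theta_n^2 = \dfrac1n\sum_{i=1}^n (X_i - \mu)^2$. This interval is reduced to $\hat\theta_n(1 \pm c_\gamma/\sqrt{2n})$ if we neglect the terms of order $1/n$ and use the standard approximation $\mathcal{L}(\hat\theta_n) \approx \mathcal{N}\left(\theta, \dfrac{\theta^2}{2n}\right)$ (see Problem 2.88).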

2.143. The result of Problem 2.87 implies that the sought-for interval based on the standard approximation for m.l.e.'s has the form $(\hat\tau_n \pm c_\gamma\sigma_\tau(\hat\theta_n)/\sqrt n)$, where $\hat\tau_n = \Phi\left(\dfrac{x_0 - \bar X}{S}\right)$, $\hat\theta_n = (\hat\theta_{1n}, \hat\theta_{2n}) = (\bar X, S)$, and $\sigma_\tau^2(\theta)$ is as in Problem 2.87.

2.144. In this case the model is defined by the $(N - 1)$-dimensional parameter $\theta = (p_1, \ldots, p_{N-1})$ (see Problem 2.63) and the likelihood function of the sample $X = (X_1, \ldots, X_n)$ is
$$L(X; \theta) = \prod_{j=1}^N p_j^{\nu_j} = \exp\left\{\sum_{j=1}^{N-1}\nu_j\ln\frac{p_j}{1 - p_1 - \ldots - p_{N-1}} + n\ln(1 - p_1 - \ldots - p_{N-1})\right\},$$
where $\nu_j$ is the number of units of the sample $X$ which are equal to $a_j$, $j = 1, \ldots, N$. We now find the solution to the likelihood equations $\dfrac{\partial\ln L}{\partial p_j} = 0$, $j = 1, \ldots, N - 1$ (i.e., the maximum likelihood estimates for the parameters $p_1, \ldots, p_{N-1}$), which has the form $\hat p_j = \nu_j/n$, $j = 1, \ldots, N - 1$. The information matrix $I(\theta)$ for this model was found in Problem 2.45. By the asymptotic theory of m.l.e.'s we have $\mathcal{L}(\sqrt n(\hat\theta_n - \theta)) \to \mathcal{N}(0, I^{-1}(\theta))$ as $n \to \infty$ and, therefore, $\mathcal{L}(Q_n(\theta)) \to \chi^2(N - 1)$ as $n \to \infty$, where the quadratic form is
$$Q_n(\theta) = n(\hat\theta_n - \theta)'I(\hat\theta_n)(\hat\theta_n - \theta) = n\sum_{r=1}^N (\hat p_r - p_r)^2/\hat p_r = \sum_{r=1}^N (\nu_r - np_r)^2/\nu_r.$$

Whence as $n \to \infty$
$$P_\theta\left(\sum_{r=1}^N (\nu_r - np_r)^2/\nu_r < \chi^2_{\gamma, N-1}\right) \to \gamma.$$

This means that the sought-for asymptotic $\gamma$-confidence region for the parameters $p_1, \ldots, p_N$ has the form
$$G_\gamma(X) = \left\{(p_1, \ldots, p_N): \sum_{r=1}^N (\nu_r - np_r)^2/\nu_r < \chi^2_{\gamma, N-1},\ 0 < p_i < 1,\ i = 1, \ldots, N,\ \sum_{i=1}^N p_i = 1\right\}$$
and is an intersection of the inside of the $N$-dimensional ellipsoid
$$\sum_{r=1}^N (\nu_r - np_r)^2/\nu_r < \chi^2_{\gamma, N-1}$$
with the hyperplane $p_1 + \ldots + p_N = 1$ within the zone $0 < p_i < 1$, $i = 1, \ldots, N$. For $N = 2$ we obtain a result similar to that of Problem 2.134.
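
2.145. By Fisher's theorem (see also Problem 1.36) $\bar X$ and $S^2$ are independent, whence follows that $\bar X - X_{n+1}$ and $S^2$ are also independent. But $\mathcal{L}(\bar X - X_{n+1}) = \mathcal{N}\left(0, \dfrac{n+1}{n}\theta_2^2\right)$ and $\mathcal{L}\left(\dfrac{nS^2}{\theta_2^2}\right) = \chi^2(n - 1)$. We then find Student's ratio $t_{n-1} = \sqrt{\dfrac{n-1}{n+1}}\,\dfrac{\bar X - X_{n+1}}{S}$, so that
$$\gamma = P_\theta\left(\sqrt{\frac{n-1}{n+1}}\,\frac{|\bar X - X_{n+1}|}{S} < t_{(1+\gamma)/2, n-1}\right) = P_\theta\left(\bar X - t_{(1+\gamma)/2, n-1}S\sqrt{\frac{n+1}{n-1}} < X_{n+1} < \bar X + t_{(1+\gamma)/2, n-1}S\sqrt{\frac{n+1}{n-1}}\right),$$
q.e.d.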

2.146. For the given data we have $\bar x = 4.196$, $s = 0.226$, $t_{0.975, 4} = 2.776$. Consequently, the required interval is (3.43, 4.96).
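
The numerical answer can be reproduced from the interval $\bar X \pm t_{(1+\gamma)/2, n-1}S\sqrt{(n + 1)/(n - 1)}$ of the previous problem. In the sketch below the sample size $n = 5$ is an assumption inferred from the four degrees of freedom of the tabulated quantile, and the function name is introduced only for illustration.

```python
import math


def prediction_interval(x_bar, s, n, t_quantile):
    """Interval for X_{n+1}; s is the sample standard deviation with divisor n."""
    half = t_quantile * s * math.sqrt((n + 1) / (n - 1))
    return x_bar - half, x_bar + half


low, high = prediction_interval(4.196, 0.226, 5, 2.776)
print(round(low, 2), round(high, 2))
```

The quantile 2.776 is passed in as a tabulated constant, so no statistical library is needed.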

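2.147. For large $n$ we have
$$\gamma \approx P\left(\frac{|\xi - n|}{\sqrt{2n}} \le c_\gamma\right) = P((\xi - n)^2 \le 2nc_\gamma^2) = P(n^2 - 2n(\xi + c_\gamma^2) + \xi^2 \le 0) = P(n_1 \le n \le n_2),$$
where $n_{1,2} = n_{1,2}(\xi) = \xi + c_\gamma^2 \mp c_\gamma\sqrt{2\xi + c_\gamma^2}$.

Since $c_{0.9} = 1.645$, the sought-for interval is (131, 189).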

TO CHAPTER 3 

3.1. We have two groups with frequencies $h_1 = 2048$ and $h_2 = n - h_1 = 1992$. Here the null hypothesis is $H_0$: $p = q = 1/2$, and the expected frequencies are $np = nq = n/2 = 2020$. The test statistic is
$$X_n^2 = \sum_{i=1}^2 \frac{(h_i - n/2)^2}{n/2} = 0.776.$$
For large $n$ this quantity is approximately distributed as $\chi^2$ with one degree of freedom. From the table of quantiles for the $\chi^2$-distribution we find $\chi^2_{0.95, 1} = 3.84$, $\chi^2_{0.9, 1} = 2.71$. Since 0.776 lies below both critical boundaries, the data are compatible with the hypothesis $H_0$.

The value 11.13 of the test statistic is compared with the critical boundary $\chi^2_{0.95, 2} = 5.99$. Since 11.13 > 5.99, the hypothesis $H_0$ is rejected.

3.7. The expected number of readings in every interval is $np_i = 500/12 = 41.67$. The test statistic is $X_n^2 = 10.00$, which is smaller than the critical boundary $\chi^2_{0.9, 11} = 17.3$, i.e., the agreement is good. The hypothesis $H_0$ is not rejected for the significance level $\alpha < 0.55$.

3.8. The test statistic is $X_n^2 = 0.47$, $\chi^2_{0.9, 3} = 6.25$, i.e., the agreement at $\alpha \le 0.9$ is good.

3.10. The boundaries of the intervals are sought from the equations $1 - e^{-x_1} = 1/4$, $e^{-x_j} - e^{-x_{j+1}} = 1/4$, $j = 1, 2$. We have $x_1 = 0.288$, $x_2 = 0.693$, $x_3 = 1.386$. Grouping within these intervals gives the frequency vector $h = (9, 9, 17, 15)$. The test statistic
$$X_n^2 = \sum_{j=1}^4 \frac{(h_j - np_j)^2}{np_j} = 4.08$$
is smaller than the critical boundary $\chi^2_{0.9, 3} = 6.25$, i.e., the hypothesis $H_0$ is not rejected.

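3.11. Since $P(\xi \le x \mid H_0) = 1 - e^{-\theta x}$, the probabilities $p_j(\theta)$ of the outcomes under $H_0$ here become
$$p_j(\theta) = e^{-\theta a(j-1)}(1 - e^{-\theta a}), \quad j = 1, \ldots, N - 1, \quad p_N(\theta) = e^{-\theta a(N-1)},$$
and the equation for finding the multinomial m.l.e. [7, p. 150] has the form
$$\sum_{j=1}^N h_j\frac{p_j'(\theta)}{p_j(\theta)} = a\sum_{j=1}^{N-1} h_j\left[\frac{e^{-\theta a}}{1 - e^{-\theta a}} - (j - 1)\right] - a(N - 1)h_N = 0.$$

We write $z = e^{-\theta a}$ and find
$$\frac{z}{1 - z}\sum_{j=1}^{N-1} h_j = \sum_{j=1}^N (j - 1)h_j.$$

Consequently, the m.l.e. is
$$\hat z_n = \left(\sum_{j=1}^N jh_j - n\right)\Big/\left(\sum_{j=1}^N jh_j - h_N\right),$$
and the respective estimates for the probabilities $p_j(\theta)$ have the form
$$\hat p_j = \hat z_n^{j-1}(1 - \hat z_n), \quad j = 1, \ldots, N - 1, \quad \hat p_N = \hat z_n^{N-1}.$$

According to the general theory [7, pp. 149-150], if the conditions $h_j \ge 5$, $j = 1, \ldots, N$, hold for large $n$, then the respective $\chi^2$ goodness of fit test rejects the hypothesis $H_0$ if and only if
$$\hat X_n^2 = \sum_{j=1}^N (h_j - n\hat p_j)^2/(n\hat p_j) > \chi^2_{1-\alpha, N-2},$$
where $\alpha$ is the chosen significance level.

For the data of Problem 1.21 and the given choice of the parameters $N$ and $a$ we have $h_1 = 28$, $h_2 = 16$, $h_3 = 6$, $\hat z_n = \dfrac{h_2 + 2h_3}{2n - h_1} = \dfrac{7}{18}$, and $\hat X_n^2 = 1.96\ldots$. Since $\chi^2_{0.9, 1} = 2.71$, the hypothesis $H_0$ is compatible with the data at the significance level $\alpha \le 0.1$.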
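
The numerical part of this solution can be reproduced with a few lines of code. The sketch below is an illustration under the assumption that the class probabilities have the geometric form $p_j = z^{j-1}(1 - z)$, $j < N$, $p_N = z^{N-1}$, with $z = e^{-\theta a}$; the function name is hypothetical.

```python
def fit_grouped_exponential(h):
    """h[j-1] is the frequency of the j-th interval, j = 1..N; the last interval is open-ended."""
    n, N = sum(h), len(h)
    weighted = sum(j * hj for j, hj in enumerate(h, start=1))
    # Maximum likelihood estimate of z = exp(-theta * a).
    z = (weighted - n) / (weighted - h[-1])
    p = [z ** (j - 1) * (1 - z) for j in range(1, N)] + [z ** (N - 1)]
    x2 = sum((hj - n * pj) ** 2 / (n * pj) for hj, pj in zip(h, p))
    return z, p, x2


z_hat, p_hat, x2 = fit_grouped_exponential([28, 16, 6])
print(z_hat, p_hat, round(x2, 2))
```

For the frequencies (28, 16, 6) the estimate equals 7/18 and the statistic is close to 1.96, as stated in the solution.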

3.12. We are dealing with a polynomial model with $N = 4$ outcomes and probabilities $p_1, \ldots, p_4$ which under the hypothesis $H_0$ have the indicated form, i.e., are functions of one unknown parameter. In order to estimate $\theta$, we must solve the equation $\sum_{i=1}^4 h_ip_i'(\theta)/p_i(\theta) = 0$, which in our case has the form
$$\frac{h_1}{2 + \theta} - \frac{h_2 + h_3}{1 - \theta} + \frac{h_4}{\theta} = 0.$$

This equation is reduced to
$$\psi(\theta) \equiv n\theta^2 + (h_4 + 2h_2 + 2h_3 - h_1)\theta - 2h_4 = 0.$$
Since $\psi(0) = -2h_4 < 0$, $\psi(1) = 3(h_2 + h_3) > 0$, the latter equation has the only root $\hat\theta_n$ in the interval (0, 1). Consequently, the $\chi^2$ goodness of fit test rejects the hypothesis $H_0$ at the significance level $\alpha$ only in the case of
$$\frac{h_1^2}{n(2 + \hat\theta_n)} + \frac{h_2^2 + h_3^2}{n(1 - \hat\theta_n)} + \frac{h_4^2}{n\hat\theta_n} > \frac{\chi^2_{1-\alpha, 2} + n}{4}.$$
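
The root of the quadratic equation and the test statistic are straightforward to compute. In the sketch below the outcome probabilities are assumed to be $(2 + \theta)/4$, $(1 - \theta)/4$, $(1 - \theta)/4$, $\theta/4$, which is consistent with the likelihood equation above but is not quoted from the problem; the counts (125, 18, 20, 34) are purely illustrative and are not taken from the text.

```python
import math


def linkage_mle(h1, h2, h3, h4):
    """Root in (0, 1) of n t^2 + (h4 + 2 h2 + 2 h3 - h1) t - 2 h4 = 0."""
    n = h1 + h2 + h3 + h4
    b = h4 + 2 * h2 + 2 * h3 - h1
    return (-b + math.sqrt(b * b + 8 * n * h4)) / (2 * n)


def chi_square_statistic(h1, h2, h3, h4):
    """Pearson statistic with the estimated parameter substituted (assumed model)."""
    n = h1 + h2 + h3 + h4
    t = linkage_mle(h1, h2, h3, h4)
    p = [(2 + t) / 4, (1 - t) / 4, (1 - t) / 4, t / 4]
    return sum((h - n * q) ** 2 / (n * q) for h, q in zip((h1, h2, h3, h4), p))


theta_hat = linkage_mle(125, 18, 20, 34)
print(round(theta_hat, 4), chi_square_statistic(125, 18, 20, 34))
```

The positive root is taken because $\psi(0) < 0 < \psi(1)$, so exactly one root of the quadratic lies in (0, 1).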

3.14. We use the $\chi^2$ goodness of fit test. The estimate for $\theta$ is $\hat\theta = \dfrac1n\sum_i ih_i = 3.870$. We calculate the estimates for the probabilities $\hat p_i = e^{-\hat\theta}\dfrac{\hat\theta^i}{i!}$, $i = 0, 1, \ldots, 10$, and the value $\hat X_n^2 = \sum_i \dfrac{(h_i - n\hat p_i)^2}{n\hat p_i} = 13.05$. We have $k = 9$ degrees of freedom. Since $\chi^2_{0.95, 9} = 16.9 > 13.05$, the hypothesis $H_0$ is not rejected.

3.15. Here $\hat\theta = 1.54$, $\hat X_n^2 = 7.95$, $k = 6$, $\chi^2_{0.9, 6} = 10.6$. The data fit the model.

3.16. Here $\hat\theta = 0.928$, $\hat p_i = e^{-\hat\theta}\dfrac{\hat\theta^i}{i!}$, $i = 0, 1, \ldots, 5$, $\hat X_n^2 = 2.172$, $k = 4$, $\chi^2_{0.95, 4} = 9.49$. The data fit well.

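3.17. For $\mathcal{L}(\xi) = Bi(2, \theta)$ the probabilities of the outcomes are
$$p_1(\theta) = P(\xi = 0) = (1 - \theta)^2, \quad p_2(\theta) = P(\xi = 1) = 2\theta(1 - \theta), \quad p_3(\theta) = P(\xi = 2) = \theta^2.$$
We find the estimate for the parameter $\theta$ from the equation
$$\sum_{j=1}^3 h_j\frac{p_j'(\theta)}{p_j(\theta)} = -\frac{2h_1}{1 - \theta} + \frac{h_2(1 - 2\theta)}{\theta(1 - \theta)} + \frac{2h_3}{\theta} = 0.$$
The estimate is $\hat\theta_n = (h_2 + 2h_3)/2n$.

Here $h_1 = 476$, $h_2 = 1017$, $h_3 = 527$, $n = 2020$. Therefore, $\hat\theta_n \approx 0.513$. We have
$$\hat X_n^2 = \sum_{j=1}^3 (h_j - np_j(\hat\theta_n))^2/(np_j(\hat\theta_n)) = 0.116$$
and compare the result with $\chi^2_{1-\alpha, 1}$. Since $\chi^2_{0.3, 1} = 0.148$, the hypothesis is accepted for any significance level $\alpha \le 0.7$.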
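
The estimate and the value of the statistic can be checked directly from the frequencies. The sketch below uses the formulas $\hat\theta_n = (h_2 + 2h_3)/2n$ and $p(\theta) = ((1 - \theta)^2, 2\theta(1 - \theta), \theta^2)$; the function name is introduced only for illustration.

```python
def fit_binomial_two_trials(h1, h2, h3):
    """Estimate theta in Bi(2, theta) from the frequencies of 0, 1, 2 and compute the chi-square statistic."""
    n = h1 + h2 + h3
    theta = (h2 + 2 * h3) / (2 * n)
    p = [(1 - theta) ** 2, 2 * theta * (1 - theta), theta**2]
    x2 = sum((h - n * q) ** 2 / (n * q) for h, q in zip((h1, h2, h3), p))
    return theta, x2


theta_hat, x2 = fit_binomial_two_trials(476, 1017, 527)
print(round(theta_hat, 3), round(x2, 3))
```

With the frequencies 476, 1017, 527 the computed values agree with 0.513 and 0.116 to the stated precision.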

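3.19. We substitute $p = p^{(n)} = (p_1^{(n)}, \ldots, p_N^{(n)})$ into the formula
$$E(X_n^2 \mid p) = n\sum_{j=1}^N \frac{(p_j - p_j^0)^2}{p_j^0} + \sum_{j=1}^N \frac{p_j(1 - p_j)}{p_j^0}$$
and take into account that $\sum_{j=1}^N p_j^0 = 1$, $\sum_{j=1}^N \beta_j = 0$. We then find
$$E(X_n^2 \mid p^{(n)}) = N - 1 + \sum_{j=1}^N \frac{\beta_j^2}{p_j^0} + O(1/\sqrt n).$$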



Since 



*.- s *^- E '"(^-jS?)*- 



j-i 

we have 



j?«= i +^ 2j «***+<*'""'* 



«i, - 1 + 



2] # /p ° 



/?„ = JV + 0(1/ vn), Ru = JV + 0(l/vn). *„ = 0(1). 
Using the formula 

n 
+ 2 !Lll (3R M - 2/Sl*.! - fill) + i <*« " *? .). 
we find
$$D(X_n^2 \mid p^{(n)}) = 2(N - 1) + 4\sum_{j=1}^N \frac{\beta_j^2}{p_j^0} + O(1/\sqrt n).$$
3.20. We have 

(1 -pfV = exp |„[n(i -pf>)| = ex p l-„ p f> + 0(n-')) 

- cxp [ -g - c &;/« l/4 + 0(rt"'j) 

- e-«{l - e A,/ n ""' + 9 %f/3tfn + <J0» "*">}. 

By substituting (his expansion into EOiojHj"*) = ^ (] - / ,j">)", we arrive at 

the sought-for expression for the mean. 
We use the formula 

ft 

»»=2Si(i-«- pjr - (i - p,ro - pj)-] - S (« - pj> 3 " + e w 

to investigate the variance. The asymptotic* of the second sum under the 
hypothesis t/\ i* found as above and is equal to Ne' 1 " up to the main term. 
Let us estimate the general term of the first sum. We have 

- A - pj? - (1 - p,)"(l - pjf * (1 - p. - pj)" - (1 - p t ~ m + p ,pjf 
<= exp ) -n(p, + p/} -~ t» + Pj) 2 + 0(JV -a ) I 

— exp 



| - »(p, + pj} +■ npjft - £ {p, + PJ f + 0(/V* 2 > 5 
j -"(Pi + W) - j (A + Pj) 2 i 



= exp 

X [exp (OfA/- 1 )) - cxp [ma-py + CKN*m 

= ~np i p i e-''P , -"PJ + 0(N' ! ). 
The first sum can be represented as 

-2" 2 P/Pje~ ""'-""/ + 0(1) 

(» \i « 

S P/e- n «' J + n 2 pje-^ + O(l). 
j-\ / 7_i 

JV 

Here the second term is 0(1), while J ^""'se"' + o(l). Hence, the 

/"' 
entire expression is equal to -/Vpe"" 2 ' + 0{V). Then 

D^j/Vi-') = -JVpe- 2 " - JVe- J ° + /Ve"' + OfA 1 " 2 ), 

3.21. We apply the $\chi^2$-test for homogeneity. The test statistic is
$$X_n^2 = n_1n_2\sum_{i=1}^s \frac{1}{\nu_{i\cdot}}\left(\frac{\nu_{i1}}{n_1} - \frac{\nu_{i2}}{n_2}\right)^2 = 2.18, \quad s = 4,\ k = 2,$$
and the critical boundary of the test is $\chi^2_{1-\alpha, (s-1)(k-1)} = \chi^2_{0.9, 3} = 6.25$. Then $X_n^2 < \chi^2_{0.9, 3}$, and the agreement is good.

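3.23. (1) Consider the difference $\Delta_{ij} = \nu_{ij} - \nu_{i\cdot}\nu_{\cdot j}/n$, $i, j = 1, 2$. We can directly check that all the sums $\Delta_{i1} + \Delta_{i2}$ and $\Delta_{1j} + \Delta_{2j}$ are zero. For example,
$$\Delta_{11} + \Delta_{12} = \nu_{11} + \nu_{12} - \frac{\nu_{1\cdot}\nu_{\cdot 1}}{n} - \frac{\nu_{1\cdot}\nu_{\cdot 2}}{n} = \nu_{1\cdot} - \frac{\nu_{1\cdot}}{n}(\nu_{\cdot 1} + \nu_{\cdot 2}) = \nu_{1\cdot} - \nu_{1\cdot} = 0.$$

Thus, the absolute values of the four quantities $\Delta_{ij}$ are equal and, therefore,
$$X_n^2 = \sum_{i,j=1}^2 \frac{\Delta_{ij}^2}{\nu_{i\cdot}\nu_{\cdot j}/n} = n\Delta_{11}^2\sum_{i,j=1}^2 \frac{1}{\nu_{i\cdot}\nu_{\cdot j}} = \frac{n^3\Delta_{11}^2}{\nu_{1\cdot}\nu_{2\cdot}\nu_{\cdot 1}\nu_{\cdot 2}}.$$

We then have
$$n\Delta_{11} = \nu_{11}\nu_{22} - \nu_{12}\nu_{21}$$
and, finally,
$$X_n^2 = \frac{n(\nu_{11}\nu_{22} - \nu_{12}\nu_{21})^2}{\nu_{1\cdot}\nu_{2\cdot}\nu_{\cdot 1}\nu_{\cdot 2}}.$$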
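
The reduction of the general $\chi^2$ statistic for a $2 \times 2$ table to the closed form $n(\nu_{11}\nu_{22} - \nu_{12}\nu_{21})^2/(\nu_{1\cdot}\nu_{2\cdot}\nu_{\cdot 1}\nu_{\cdot 2})$ can be confirmed with exact arithmetic. The sketch below is illustrative; the function names and the example table are hypothetical.

```python
from fractions import Fraction


def chi_square_general(table):
    """Sum over cells of (observed - expected)^2 / expected, expected = row * column / n."""
    (a, b), (c, d) = table
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    total = Fraction(0)
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = Fraction(rows[i] * cols[j], n)
            total += (observed - expected) ** 2 / expected
    return total


def chi_square_shortcut(table):
    """Closed form n (ad - bc)^2 / (product of the four marginal totals)."""
    (a, b), (c, d) = table
    n = a + b + c + d
    return Fraction(n * (a * d - b * c) ** 2, (a + b) * (c + d) * (a + c) * (b + d))


table = ((10, 20), (30, 40))
print(chi_square_general(table), chi_square_shortcut(table))
```

Exact fractions are used so that the two expressions can be compared for equality rather than within a tolerance.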



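(2) Note that the random variables $\nu_{11}$ and $\nu_{12}$ are independent by the statement of the problem, and for some $p \in (0, 1)$ we have $\mathcal{L}(\nu_{1j}) = Bi(n_j, p)$, $j = 1, 2$, if the hypothesis $H_0$ is true. For $n_1, n_2 \to \infty$ we find by the De Moivre-Laplace theorem that $\mathcal{L}(\nu_{1j}) \sim \mathcal{N}(n_jp, n_jpq)$, $q = 1 - p$, or $\mathcal{L}(\nu_{1j}/n_j) \sim \mathcal{N}\left(p, \dfrac{pq}{n_j}\right)$, $j = 1, 2$. Then
$$\mathcal{L}\left(\frac{\nu_{11}}{n_1} - \frac{\nu_{12}}{n_2}\right) \sim \mathcal{N}\left(0, \frac{npq}{n_1n_2}\right), \quad n = n_1 + n_2.$$

Thus, under the hypothesis $H_0$ the random variable
$$Z_n = \left(\frac{\nu_{11}}{n_1} - \frac{\nu_{12}}{n_2}\right)\sqrt{\frac{n_1n_2}{npq}}$$
is asymptotically normal as $\mathcal{N}(0, 1)$. Since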
Z, 






— -p under Wo, we have I — 1. It follows that the limiting distribu- 

" "V ""lo- 

tions of the random variables Z„ and f nln} coincide for the hypothesis Ho. 
q~e.d. 

Finally, under the given alternative the mean value of the difference 
Pi I VI2 

— ispi - p t > 0. Therefore, while 'testing the hypothesis Ho against 

the alternative //, , the critical region should be chosen as {Z„ > < a |. Since 
P(Z„ > r„|//o) — *( — to), at the significance level at the critical boundary has 
the Form 

In - -*"'<«) - **'d -ej« «i-„. 

Remark. If for any hypothesis defined by the probabilities pi, pi as 

it, ni — * «, we have 



y 



/mi fi 2 \ / /»i9i />i?l\ 

I — - — 1 - - ' ( pi - Pi' — ■*■ — 1 . 

\fli m/ \ «i ni J 



then, by reasonin g in th e same way, we may show that for the close alternative 
Hf i (pi - PiV^PiiH ■ o/vn, a 7t o, 

^(2«|ffl") ■* ' (W-y(J - ^), J), y = lim — . 

This allows us to compute the power of the test for such alternatives.

3.24. Since (see Problem 1.54) we have
$$\mathcal{L}(\nu_1, \ldots, \nu_N \mid \nu_1 + \ldots + \nu_N = n) = M(n; p_1, \ldots, p_N)$$
for $p_j = \theta_j\Big/\sum_{i=1}^N \theta_i$, $j = 1, \ldots, N$, the hypothesis $H_0$ is equivalent here to the hypothesis about equal probabilities of outcomes in a polynomial model. Consequently, the $\chi^2$ goodness of fit test has the form
$$X_n^2 = \sum_{j=1}^N \frac{(\nu_j - n/N)^2}{n/N} > \chi^2_{1-\alpha, N-1}.$$

Remark. We use $\bar\nu$, $S^2 = \dfrac1N\sum_{j=1}^N (\nu_j - \bar\nu)^2$ to denote the sample mean and variance and write the statistic $X_n^2$ as $X_n^2 = NS^2/\bar\nu$. Under the hypothesis $H_0$ the theoretical mean and variance of the observations are equal and, therefore, if $H_0$ is true, then we have $S^2/\bar\nu \xrightarrow{P} 1$ as $n \to \infty$, or $X_n^2/N \xrightarrow{P} 1$.

3. 25. (1) Since the joint distribution density of the order statistics X m 

g{Xi, , ■ . , Jtn) = «!( < Xl <: JS! £ - . £ Jt- £ 1 

(see Problem 1.31), by the formula for the total probability, the unconditional 
distribution of the vector* = (xi *«.i) has the form 

P(xi = ki. i = I, . . .. n + 1) 

= ^___ [ it ** - *,. ,>*'£(* — *><*»... d». 

j-i 

= ^ [d Xi \ dxt... \ [J to-A-ilr*** 



*•- 1 j- I 



(here it, + . . . + *„. i = m. j* = O.Jftti * i). Integrating this with respect 
to x„, x„-i, etc, and applying the formula 



3 

s 



^-^-^^FTFrnF* 1 -* ' 



we get 

m!n! *„!fc„n! 

P(m = kt, i ■ 1, . . . , n + 1) = 



*■!.,. *„«.il <*„ + *:„♦, + 1)! 
*„.,!(*„ + *„„i + !)! *i!(fc + ... + fan + n- 1-)! 



mini 



' — (t-FM +fl) 



(m + n)! 

On the other hand, for *i + - . + *»+ 1 ■ * and arbitrary p € (0, 1), 
4 = 1 - p, we have 

P(& = fc, / = 1, . . ., ft + 1|6 + ... +&, + [=*) 

P(fr = *i, i = 1, ,.., n + 1) '■' _ (C; j-, 

*«, + ...' + $„♦. = w) C.A"' 
because $\mathcal{L}(\xi_1 + \ldots + \xi_{n+1}) = \overline{Bi}(n + 1, p)$ (see Problem 1.39 (5)). Thus, if the homogeneity hypothesis $H_0$ is true, then

S(,x) = ^\k , £,*,[{) + ... + 61 + 1 = *>. 

(2) We calculate 

*(W, *t = *) = P f S /«( = 0) = ftjf. + ... + fc +l - m j 

= *( S W = 0) = A. ti + . - - + fc*i = mjk + . . . + & * , = /*). 

in order to find the numerator, we write the numbers of the zero values. This 
can be done in rt + i different ways. We write 

C„* + ,P(f, > 0, i = I, .... n + 1 - fr. g => 0. j = n + 2 - k> .... n + I) 

x P«i + . . ■ + 6,^1 = m\tt > 0, i = 1, . .., tt + 1 - *-, fc = 0. 
> = n + 2- *, ..... n + 1) = C„\ lP " +| -VP{fl * ■-. + f,»i-* = FM) 
(see (2) in the hint). For r = 1, 2, ..., we have 

Ptfi = r| = P(f, = r)/P(f, > 0) - £-2 = y-v. 

/> 

.*.. •(£) = ^ + ]). Then y'(f, + ...+£>. y«, + ... + {, + j) 
and, therefore, 

P<6 + - . + & = m) = P(£, + ...+£, = „,- y> = CJt*i <7 V" " '. 

We finally obtain 

and 

-^{Js(fi, ffs)) = //(/t + I, n + m, n), 

Using the formulas for the moments of the hypergeometric distribution, we find that under the hypothesis $H_0$
$$Es_0(n, m) = \frac{n(n + 1)}{n + m}, \quad Ds_0(n, m) = \frac{n(n + 1)m(m - 1)}{(n + m)^2(n + m - 1)}.$$

If $n, m \to \infty$ so that $m/n = \rho > 0$, then
$$Es_0(n, m) = \frac{n}{1 + \rho} + O(1), \quad Ds_0(n, m) = \frac{n\rho^2}{(1 + \rho)^3} + O(1).$$






(3) The formula for P(J (n. m) = k) given In the hint follows from (2). 
Lei k - (n + l)p + rJnpq, \t\ sj c < to. Then 

v2irnp<j 

and n - k - (m - l)p - — <4npq. Therefore, 
vg 

bln - k;m - l , p) = ±±M*-' 1 '™ 

•Jlirmpq 

Since n = (« + m)p, we have 

1 + o(t> 



fr(/i: n + m, p) 



and unite these estimates to obtain 



P( J0 („, m) = *) = I "? B-rtl**^! + «(»). 
■\ iTtmnpq 

This means that for A- = -~ ' + W/j e V(l + «)'. M « « < "=< wc wifl set 

1 + Q 

V2irn e V(l + e)* 
i.t, we arrive at the normal local (and hence integral) limit theorem. 

n * L 

EW», m)|//,] = S p <* = °l w '>- 
i- 1 

We first investigate the terms for $i = 2, \ldots, n$. The probability that the block
$B_i$ is empty for fixed $X_{(i-1)} = x_1 < X_{(i)} = x_2$ is $[1 - F(x_2) + F(x_1)]^m$. By the
formula for the total probability we get

P(*, = 0|Hi) 

i i 

= "- — [dxt \(1 - F(m) + FMrxT^Q - xiy'dxi. 

y - 2)!<i, - Of J J 

*i 

For $i = 1$ and $i = n + 1$ the probabilities will change. After division by
$n + 1$ their expressions make a zero contribution to the sum in the limit, and
hence we can neglect them. By summing the resultant expressions, we get
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L 1 

"(« - 1) f T 

= — + — \ <*xi In- F <*i> + ^wrc - jft + ny 1 * + &,, 

O x, 

where $\varepsilon_n < 2/(n + 1)$. We change the variables $x_1 = y_1$, $x_2 = y_1 + y_2/n$ and
write

O 



F(v,) 



.Vi/n 



ot» + e. 



In the limit as $n, m \to \infty$, $m = \rho n$, we get
lim 



n E — -± Hi 

" l" + l I J 



fltK 



taking into account that the function $F$ is differentiable. Then (see the hint)



-a 



*!<*) «jw <*t 



t ) < l (i + art*)) d* \ -j- ' 

O 

I 



art*) 



art*) 



I 

since $\int_0^1 f(x)\,dx = 1$. The equality only holds when the functions $g_1(x)$ and
$g_2(x)$ are proportional. Then $1 + \rho f(x) = \mathrm{const}$, i.e., $f(x) = \mathrm{const}$. But $f(x)$
is the density function and hence is identically unity. Thus, if the alternative
is true, then the inequality will be strict, and under the alternative the statistic
$s_0(n, m)$ will asymptotically tend to larger values than under the hypothesis
$H_0$. Therefore, the critical region for $H_0$ should be chosen in the form
$\{s_0(n, m) \ge c\}$. At the significance level $\alpha$ the critical boundary asymptotically
tends to

$$c = c_\alpha(n) = \frac{n}{1+\rho} - u_\alpha\,\frac{\rho\sqrt{n}}{(1+\rho)^{3/2}}.$$



3.26. We have Xl - n f "57 -^- - 1 ) « *« < **»* = 9 ' 49 - The 



•izj 



-A_ - l) = 8.09 < xi, 



V 

hypothesis is not rejected. 

3.28. (1) See the solution to Problem 3.23 (1).

(2) Let $(X_1, Y_1), \ldots, (X_n, Y_n)$ be independent observations on $(\xi_1, \xi_2)$ and
$X = (X_1, \ldots, X_n)$, $Y = (Y_1, \ldots, Y_n)$. Then the sample means are



— "i — " i 

x = J-, y = — , 



the sample variances are 



n 



S| = S 2 (Y) = ^± 



'1 



n< 



and the sample covariance is 

I 



s, 



n ^"^ n « n 



Then (see the solution to Problem 1.38)

The theoretical correlation coefficient is

$$\rho = \frac{\mathbf{E}\xi_1\xi_2 - \mathbf{E}\xi_1\mathbf{E}\xi_2}{\sqrt{\mathbf{D}\xi_1\mathbf{D}\xi_2}} = \frac{P(\xi_1 = 1, \xi_2 = 1) - P(\xi_1 = 1)P(\xi_2 = 1)}{\sqrt{P(\xi_1 = 1)P(\xi_1 = 0)P(\xi_2 = 1)P(\xi_2 = 0)}} = \frac{P(AB) - P(A)P(B)}{\sqrt{P(A)P(\bar A)P(B)P(\bar B)}}.$$

But

$$P(AB) - P(A)P(B) = P(AB)P(\bar B) - P(A\bar B)P(B) = P(B)P(\bar B)\,[P(AB)/P(B) - P(A\bar B)/P(\bar B)],$$

whence follows the second representation for $\rho$.

(3) Consider the random variable

$$\zeta_n = \sum_{i=1}^{n} (X_i - P(A))(Y_i - P(B)) \Big/ \sqrt{nP(A)P(\bar A)P(B)P(\bar B)}.$$

Since under $H_0$ we have

$$\mathbf{E}(X_i - P(A))(Y_i - P(B)) = \mathbf{E}(X_i - P(A))\,\mathbf{E}(Y_i - P(B)) = 0,$$

$$\mathbf{D}(X_i - P(A))(Y_i - P(B)) = \mathbf{E}(X_i - P(A))^2(Y_i - P(B))^2 = \mathbf{E}(X_i - P(A))^2\,\mathbf{E}(Y_i - P(B))^2 = \mathbf{D}X_i\,\mathbf{D}Y_i = P(A)P(\bar A)P(B)P(\bar B),$$

$\zeta_n$ is a normalized sum of independent and identically distributed random variables. By the Central Limit Theorem we have

$$\mathcal{L}(\zeta_n) \to \mathcal{N}(0, 1)$$

as $n \to \infty$ and

$$\sum_{i=1}^{n} (X_i - \bar X)(Y_i - \bar Y) = \sum_{i=1}^{n} (X_i - P(A))(Y_i - P(B)) - n(\bar X - P(A))(\bar Y - P(B)).$$

Using (2), we may write $Z_n$ in the form

b" 1 S wr, - 5)(y f - n 



Z, = n'^Sii/t.SiSi) = 



Vx(i - A^m - y) 



L " ^P(A)PiA) VP(B)F{fljJ L^Xl - A)F(1 - nj 

By the theorem on the asymptotic normality of sampling moments we have

p _ p 

as $n \to \infty$. According to the law of large numbers, $\bar X \xrightarrow{P} P(A)$, $\bar Y \xrightarrow{P} P(B)$.
Hence,



[ P(.4)P(,4)P(fl)P(g) "I ■« ^ i n" „. 
5(1 - X)Y{\ - Y) J ' ■ " " "~ 



\x - P(A)xr~ P(fl )> p 

VP(>4)P(^>P(fl)P(fl) 



as $n \to \infty$, and therefore the limiting distributions of $Z_n$ and $\zeta_n$ coincide under
the hypothesis $H_0$.

We see that $\rho = 0$ under the hypothesis $H_0$, while $\rho > 0$ under the alternative
$H_1$, whence $Z_n \xrightarrow{P} +\infty$ as $n \to \infty$. Therefore, large values of the statistic
$Z_n$ testify that the alternative is preferable. In other words, when testing the
hypothesis $H_0$ against the alternative $H_1$, we should choose the critical region
in the form $\{Z_n \ge t_\alpha\}$. Since

$$P(Z_n \ge t_\alpha \mid H_0) \approx \Phi(-t_\alpha),$$

for the chosen significance level $\alpha$ the critical boundary is $t_\alpha = -\Phi^{-1}(\alpha) = \Phi^{-1}(1 - \alpha) = u_{1-\alpha}$.

Remark. Reasoning along the same lines, we may show that if a close alternative
is defined by the conditions

$$H_1^{(n)}\colon\ P(AB) = P(A)P(B) + O(n^{-1/2}), \qquad \rho = \rho^{(n)} = an^{-1/2}, \quad a > 0,$$

then $\mathcal{L}(Z_n \mid H_1^{(n)}) \to \mathcal{N}(a, 1)$, and hence for $n \to \infty$ the power of the constructed
test satisfies the limiting relation



3.29. Here (see Problem 3.28)

$$Z_n = \left(\frac{97}{360} - \frac{40}{82}\right)\sqrt{\frac{360 \times 82 \times 442}{137 \times 305}} = -3.86$$

and $X_n^2 = Z_n^2 = 14.89$. Since $\chi^2_{0.999,\,1} = 10.8$, the hypothesis that the
features are independent must be rejected (then the probability of erroneous
decision is smaller than $10^{-3}$). At the same time the data testify against the
hypothesis $H_1$ since $Z_n < 0$ (this may be interpreted, for example, as the absence
of discrimination for the women entering the university).



3.30. Here

$$Z_n = \left(\frac{276}{749} - \frac{3}{69}\right)\sqrt{\frac{749 \times 69 \times 818}{279 \times 539}} = 5.45$$

and $X_n^2 = Z_n^2 = 29.70$. Using the data we already have (see the solution to the previous
problem), we reject the hypothesis $H_0$. Since $Z_n > \Phi^{-1}(0.9999) = 3.72$, the
data (see the solution to Problem 3.28 (3)) verify the hypothesis $H_1$. The probability
to make an error when rejecting $H_0$ and accepting $H_1$ is smaller than
$10^{-4}$.
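The arithmetic of Problem 3.30 can be checked numerically. The sketch below assumes that the statistic of Problem 3.28 has the form $Z_n = (a/n_1 - b/n_2)\sqrt{n_1 n_2 n/(s(n - s))}$ with $n = n_1 + n_2$ and $s = a + b$, which is how the numbers 276, 749, 3, 69, 818, 279 and 539 combine above; the function name `z_statistic` is ours.

```python
import math

def z_statistic(a, n1, b, n2):
    """Z statistic comparing the proportions a/n1 and b/n2 of a 2x2 table."""
    n = n1 + n2          # total number of observations
    s = a + b            # total number of "successes"
    return (a / n1 - b / n2) * math.sqrt(n1 * n2 * n / (s * (n - s)))

# Data of Problem 3.30
z = z_statistic(276, 749, 3, 69)
print(round(z, 2), round(z * z, 2))
```

The squared value is the chi-square statistic with one degree of freedom, so the same function serves both the one-sided and the two-sided comparison.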

3.31. We are verifying the randomness hypothesis $H_0$ [7, p. 169]. In this
case the number of inversions is $T_n = 0$, and the probability to get this value
under the hypothesis $H_0$ is $(8!)^{-1} = 0.25 \times 10^{-4}$. Consequently, the hypothesis
$H_0$ is rejected for any reasonable significance level.



3.32. We write
function is 



Ee" 7 ? = exp 



= exp 



T~„ = 6 I 7" B - — - J « -JVI . Then the characteristic 

[- m lj- " W-"'> 
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- R>r |f| ^ c < «, h-*b and any Jt — 1, . . . , n we have 

dtp I6itkn- i/z ) - I = oWm"" 2 - IS/ 1 * 2 /!"' + O^n - " 1 ). 
Then 

We take logarithms and use the formula



In {I + *) - x + 0{x\ x - 0, 

2 



to obtain 



/■■J f— 3 

whence follows the required assertion. 

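3.39. Here the likelihood ratio statistic is

$$l(x) = \frac{L(x; \theta_1)}{L(x; \theta_0)} = \prod_{i=1}^{n} \theta_1^{x_i}(1 - \theta_1)^{k - x_i} \Big/ \prod_{i=1}^{n} \theta_0^{x_i}(1 - \theta_0)^{k - x_i} = \left[\frac{\theta_1(1 - \theta_0)}{\theta_0(1 - \theta_1)}\right]^{T}\left(\frac{1 - \theta_1}{1 - \theta_0}\right)^{kn}, \qquad T = \sum_{i=1}^{n} x_i,$$

and $\frac{\theta_1(1 - \theta_0)}{\theta_0(1 - \theta_1)} > 1$ for $\theta_1 > \theta_0$. Therefore, the inequality $l \ge c$ is equivalent
to $T \ge t$.

Let $\alpha$ be the given probability of Type I error. We find the integer $t_\alpha$ from
the condition

$$\alpha'' = \sum_{i=t_\alpha+1}^{kn} C_{kn}^{i}\theta_0^{i}(1 - \theta_0)^{kn-i} < \alpha \le \sum_{i=t_\alpha}^{kn} C_{kn}^{i}\theta_0^{i}(1 - \theta_0)^{kn-i} = \alpha'. \qquad (*)$$

If $\alpha = \alpha'$, the sought-for test is non-randomized and has the form
$\mathcal{X}^*_{1\alpha} = \{T \ge t_\alpha\}$.

Since $\mathcal{L}_\theta(T) = Bi(kn, \theta)$ (see Problem 1.39 (3)), the probability of Type I
error for this test is

$$W(\mathcal{X}^*_{1\alpha}; \theta_0) = P_{\theta_0}(T \ge t_\alpha) = \alpha' = \alpha,$$

and its power is

$$W(\mathcal{X}^*_{1\alpha}; \theta_1) = P_{\theta_1}(T \ge t_\alpha) = \sum_{i=t_\alpha}^{kn} C_{kn}^{i}\theta_1^{i}(1 - \theta_1)^{kn-i}.$$

If we have a strict inequality in (*), i.e., $\alpha < \alpha'$, then the Neyman-Pearson
test is randomized and defined by the critical function

$$\varphi^*_\alpha(T) = \begin{cases} 1 & \text{for } T \ge t_\alpha + 1, \\ \dfrac{\alpha - \alpha''}{\alpha' - \alpha''} & \text{for } T = t_\alpha, \\ 0 & \text{for } T \le t_\alpha - 1. \end{cases}$$

The test power is

$$W(\varphi^*_\alpha; \theta_1) = \mathbf{E}_{\theta_1}\varphi^*_\alpha(T) = P_{\theta_1}(T \ge t_\alpha + 1) + \frac{\alpha - \alpha''}{\alpha' - \alpha''}\,P_{\theta_1}(T = t_\alpha).$$

In this case (for $\alpha < \alpha'$) we may also use non-randomized most powerful tests
$\mathcal{X}^*_{1\alpha'} = \{T \ge t_\alpha\}$ and $\mathcal{X}^*_{1\alpha''} = \{T \ge t_\alpha + 1\}$ with the significance levels
$\alpha' > \alpha$ and $\alpha'' < \alpha$, respectively.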
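The construction of Problem 3.39 can be illustrated by a short computation. The sketch below assumes that condition (*) reads $\alpha'' = P_{\theta_0}(T \ge t_\alpha + 1) < \alpha \le P_{\theta_0}(T \ge t_\alpha) = \alpha'$ with $T$ distributed as $Bi(N, \theta_0)$, $N = kn$; the function names and the numerical example ($N = 10$, $\theta_0 = 0.5$, $\alpha = 0.05$) are ours and serve only as an illustration.

```python
from math import comb

def upper_tail(t, N, theta):
    """P(T >= t) for T ~ Bi(N, theta)."""
    return sum(comb(N, i) * theta**i * (1 - theta)**(N - i)
               for i in range(max(t, 0), N + 1))

def randomized_test(N, theta0, alpha):
    """Return (t, a_hi, a_lo, gamma) with a_lo < alpha <= a_hi.

    a_hi = P(T >= t) and a_lo = P(T >= t + 1) under theta0 play the roles
    of alpha' and alpha''; gamma is the rejection probability at T = t.
    """
    t = 0
    # Increase t while the next tail probability is still at least alpha.
    while upper_tail(t + 1, N, theta0) >= alpha:
        t += 1
    a_hi = upper_tail(t, N, theta0)
    a_lo = upper_tail(t + 1, N, theta0)
    gamma = (alpha - a_lo) / (a_hi - a_lo)
    return t, a_hi, a_lo, gamma

t, a_hi, a_lo, gamma = randomized_test(N=10, theta0=0.5, alpha=0.05)
size = a_lo + gamma * (a_hi - a_lo)   # exact size of the randomized test
print(t, a_hi, a_lo, gamma, size)
```

The randomization probability is chosen so that the size of the test equals $\alpha$ exactly; when $\alpha = \alpha'$ it equals one and the test reduces to the non-randomized one.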

3.40. By the De Moivre-Laplace theorem we have

$$\mathcal{L}_\theta(T) \sim \mathcal{N}(kn\theta,\ kn\theta(1 - \theta))$$

as $n \to \infty$. Therefore, the condition (*) to define the critical boundary $t_\alpha$ may
be replaced by an approximate condition

$$P_{\theta_0}(T \ge t_\alpha) \approx \Phi\!\left(\frac{kn\theta_0 - t_\alpha}{\sqrt{kn\theta_0(1 - \theta_0)}}\right) = \alpha,$$

whence follows the sought-for critical region. We then have

{ t- *«ei n > 



= * 

which is equivalent to the second assertion. 



(p I + u„ + 0(1)) + o<l), 




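3.41. The test is constructed by the scheme of Problem 3.39. Here the
sufficient statistic has the form $T = \sum_{i=1}^{n} X_i$ and $\mathcal{L}_\theta(T) = \Pi(n\theta)$, and the
likelihood ratio statistic is

$$l(X) = \left(\frac{\theta_1}{\theta_0}\right)^{T} e^{-n(\theta_1 - \theta_0)}.$$

The condition

$$\alpha'' = \sum_{m=t_\alpha+1}^{\infty} e^{-n\theta_0}\frac{(n\theta_0)^m}{m!} < \alpha \le \sum_{m=t_\alpha}^{\infty} e^{-n\theta_0}\frac{(n\theta_0)^m}{m!} = \alpha' \qquad (*)$$

allows us to find the critical boundary $t_\alpha$ for the given probability $\alpha$ of Type I
error. For $\alpha = \alpha'$ the test has the form $\mathcal{X}^*_{1\alpha} = \{T \ge t_\alpha\}$, and for $\alpha < \alpha'$ it
is randomized and its critical function $\varphi^*_\alpha(T)$ has the form given in Problem
3.39 (taking into account the notations from (*)). In any case the power is
computed by the formula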



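If $n \to \infty$, then $\mathcal{L}_\theta(T) \sim \mathcal{N}(n\theta, n\theta)$ and, by reasoning as in Problem 3.40,
we find that the asymptotic form of the test is $\{T \ge n\theta_0 - u_\alpha\sqrt{n\theta_0}\}$, and its power
under the close alternative $\theta_1 = \theta_1^{(n)} = \theta_0 + \beta/\sqrt{n}$, $\beta > 0$, satisfies the limiting
relation

$$\lim_{n\to\infty} W_n(\theta_1^{(n)}) = \Phi\!\left(\frac{\beta}{\sqrt{\theta_0}} + u_\alpha\right).$$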
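3.42. The observable random variable $X$ has a geometric distribution,
$P_\theta(X = x) = \theta^x(1 - \theta)$, $x = 0, 1, \ldots$, and the likelihood ratio statistic is
$l(X) = \left(\frac{\theta_1}{\theta_0}\right)^{X}\frac{1 - \theta_1}{1 - \theta_0}$. Consequently, the inequality $l \ge c$ is equivalent to $X \ge t$, and the critical boundary $t = t_\alpha$
is found from

$$\alpha = \theta_0^s = P_{\theta_0}(X \ge t_\alpha) = \sum_{x \ge t_\alpha} \theta_0^x(1 - \theta_0) = \theta_0^{t_\alpha},$$

whence $t_\alpha = s$.

In our case the Neyman-Pearson test has the form $\mathcal{X}^*_{1\alpha} = \{X \ge s\}$, and
its power is

$$1 - \beta = W(\mathcal{X}^*_{1\alpha}; \theta_1) = P_{\theta_1}(X \ge s) = \sum_{x \ge s} \theta_1^x(1 - \theta_1) = \theta_1^s.$$

3.43. Here the likelihood ratio statistic is

$$l(X) = \left(\frac{\theta_0}{\theta_1}\right)^{n}\exp\left\{T\left(\frac{1}{\theta_0} - \frac{1}{\theta_1}\right)\right\}, \qquad T = \sum_{i=1}^{n} X_i,$$

and $\mathcal{L}_\theta(2T/\theta) = \chi^2(2n)$ (see the hint). If $\theta_0 < \theta_1$, the inequality $l \ge c$ is equivalent
to $T \ge t$, and the condition for finding the critical boundary $t = t_\alpha$ at
the significance level $\alpha$ is

$$\alpha = P_{\theta_0}(T \ge t_\alpha) = P_{\theta_0}(2T/\theta_0 \ge 2t_\alpha/\theta_0) = P(\chi^2_{2n} \ge 2t_\alpha/\theta_0).$$

Then $2t_\alpha/\theta_0 = \chi^2_{1-\alpha,\,2n}$, and the optimal test has the form $\mathcal{X}^*_{1\alpha} = \{T \ge \frac{\theta_0}{2}\chi^2_{1-\alpha,\,2n}\}$. Its power is

$$W(\theta_1) = P_{\theta_1}\!\left(\frac{2T}{\theta_1} \ge \frac{\theta_0}{\theta_1}\chi^2_{1-\alpha,\,2n}\right) = 1 - F_{2n}\!\left(\frac{\theta_0}{\theta_1}\chi^2_{1-\alpha,\,2n}\right),$$

where $F_{2n}(t)$ is the distribution function of the law $\chi^2(2n)$. Similarly, for $\theta_0 > \theta_1$
the optimal test has the form $\mathcal{X}^*_{1\alpha} = \{T \le \frac{\theta_0}{2}\chi^2_{\alpha,\,2n}\}$, and its power is

$$W(\theta_1) = F_{2n}\!\left(\frac{\theta_0}{\theta_1}\chi^2_{\alpha,\,2n}\right).$$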



3.44. In our case the set of critical values for the observation is found from

$$l(x) = \frac{1 + x^2}{1 + (x - 1)^2} \ge c,$$

which is equivalent to the condition $(c - 1)x^2 - 2cx + 2c - 1 \le 0$. If $c = 1$,
the inequality holds for $x \ge 1/2$. This means that the $\mathcal{X}_{1\alpha} = \{X \ge 1/2\}$-test
has the significance level

$$\alpha = P_0\!\left(X \ge \tfrac12\right) = \frac{1}{\pi}\int_{1/2}^{\infty}\frac{dx}{1 + x^2} = \frac12 - \frac{1}{\pi}\arctan\frac12,$$

and its power is

$$P_1\!\left(X \ge \tfrac12\right) = \frac{1}{\pi}\int_{1/2}^{\infty}\frac{dx}{1 + (x - 1)^2} = \frac12 + \frac{1}{\pi}\arctan\frac12.$$

Putting $c = 2$, we obtain the inequality $(x - 2)^2 \le 1$ or $1 \le x \le 3$. This
means that the significance level of the $\mathcal{X}_{1\alpha} = \{1 \le X \le 3\}$-test is

$$\alpha = P_0(1 \le X \le 3) = \frac{1}{\pi}\int_1^3\frac{dx}{1 + x^2} = \frac{1}{\pi}(\arctan 3 - \arctan 1),$$

and its power is

$$P_1(1 \le X \le 3) = \frac{1}{\pi}\int_1^3\frac{dx}{1 + (x - 1)^2} = \frac{1}{\pi}\arctan 2.$$
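The four probabilities of Problem 3.44 are easy to evaluate numerically. The sketch below assumes, as in the solution, that the observation has the Cauchy density $1/(\pi(1 + (x - \theta)^2))$ with $\theta = 0$ under the hypothesis and $\theta = 1$ under the alternative; the helper name `cauchy_prob` is ours.

```python
import math

def cauchy_prob(lo, hi, shift=0.0):
    """P(lo <= X <= hi) for the density 1 / (pi * (1 + (x - shift)^2))."""
    return (math.atan(hi - shift) - math.atan(lo - shift)) / math.pi

# Test {X >= 1/2} (c = 1)
alpha_c1 = cauchy_prob(0.5, math.inf, shift=0.0)   # significance level
power_c1 = cauchy_prob(0.5, math.inf, shift=1.0)   # power

# Test {1 <= X <= 3} (c = 2)
alpha_c2 = cauchy_prob(1.0, 3.0, shift=0.0)        # significance level
power_c2 = cauchy_prob(1.0, 3.0, shift=1.0)        # power

print(alpha_c1, power_c1, alpha_c2, power_c2)
```

For $c = 1$ the two error probabilities coincide, so the level and the power add up to one.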



3.45. Let $X = (X_1, \ldots, X_n)$ be a sample from the distribution $\mathcal{L}(\xi)$. If
we have at least one $|X_i| > a$, then this event is impossible under the hypothesis
$H_0$ and it must be rejected. In other cases, i.e., for $T_n^{(1)} = \max_i |X_i| \le a$, we

make our decision by investigating the likelihood ratio statistic

Here the inequality / it c is equivalent to 7j Sl 4 f, and the sought- for lest has 
the form t 

.^;= = i^" > ai u sn" *S a. If***] 

= (7< J >> or r«><r„), 
where the boundary /„ at the significance level a is found from the condition 
o = P(X «.*'Tn|Ho) = P(T?» £ MM.) 



i'i 



<&■,...<**„ (irf t ,)'' 



t+Z +h*,. (2a) " r[^+ 1 J^a)" 



(H< 



(because the event $\{T_n^{(1)} \le a\}$ is certain under the hypothesis $H_0$). We find
the estimate

$$W \ge P(T_n^{(1)} > a \mid H_1) = 1 - P^n(|X_1| \le a \mid H_1)$$

for the power of the test. Consequently, the probability of Type II error satisfies
the inequality



,<[.-»(-)]• 



for any a. If n -» «>, by the Central Limit Theorem we have 



where 



„ v.i.x\\iM ■ ■ i ->.■•■ be* £>, 



a 



A 1 = D(A, |«b) = — \ Jf 4 *r - / = — a" 



Consequently, we may use an approximate equation 



to find /„, whence 



no* 2a z in 



3.46. We use $T_n$ to denote the number of positive outcomes in $n$ trials.
Then $\mathcal{L}(T_n) = Bi(n, p)$, and the event $\{T_n > 0\}$ is impossible under the
hypothesis $H_0$. It follows that the test should be given as $\mathcal{X}_1 = \{T_n > 0\}$, i.e.,
we accept the hypothesis $H_0$ for $T_n = 0$ and accept the hypothesis $H_1$ in all
the other cases. Then the probabilities of errors will be

$$\alpha = P(T_n > 0 \mid H_0) = 0, \qquad \beta = P(T_n = 0 \mid H_1) = 0.99^n.$$

We find $n$ from the condition $0.99^n \le 0.01$, whence $n \ge 459$.
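The required number of trials in Problem 3.46 can be found by direct search. The sketch below assumes only the condition $0.99^n \le 0.01$ stated in the solution; the function name and its default arguments are ours.

```python
def min_trials(p_no_event=0.99, beta=0.01):
    """Smallest n such that p_no_event**n <= beta."""
    n = 1
    while p_no_event**n > beta:
        n += 1
    return n

print(min_trials())
```

The same value follows from the closed form: $n$ is the least integer not smaller than $\ln 0.01/\ln 0.99$.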

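3.47. Here the likelihood ratio statistic is reduced to the form

$$l(X) = \exp\left\{\frac{n}{\sigma^2}(\theta_1 - \theta_0)\bar X - \frac{n}{2\sigma^2}(\theta_1^2 - \theta_0^2)\right\}.$$

If $\theta_1 > \theta_0$, then the critical set $\mathcal{X}^*_{1\alpha}$ has the form

$$\mathcal{X}^*_{1\alpha} = \left\{\bar x \ge \theta_0 - \frac{\sigma}{\sqrt n}u_\alpha\right\},$$

and its power is

$$W(\theta_1) = P_{\theta_1}\!\left(\bar X \ge \theta_0 - \frac{\sigma}{\sqrt n}u_\alpha\right) = \Phi\!\left(\frac{\sqrt n}{\sigma}(\theta_1 - \theta_0) + u_\alpha\right) > \alpha.$$

If $\theta_0 > \theta_1$, then we have

$$\mathcal{X}^*_{1\alpha} = \left\{\bar x \le \theta_0 + \frac{\sigma}{\sqrt n}u_\alpha\right\}, \qquad W(\theta_1) = \Phi\!\left(\frac{\sqrt n}{\sigma}(\theta_0 - \theta_1) + u_\alpha\right).$$

3.48. We have two equations

$$\Phi(u_\alpha) = \alpha, \qquad \Phi\!\left(\frac{\sqrt n}{\sigma}(\theta_1 - \theta_0) + u_\alpha\right) = 1 - \beta$$

to find $n$, whence it follows that $n^*$ is the minimal integer which is not smaller
than $\sigma^2(u_\alpha + u_\beta)^2/(\theta_1 - \theta_0)^2$.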
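The sample size of Problem 3.48 can be computed directly. The sketch below assumes the bound $n^* = \lceil \sigma^2(u_\alpha + u_\beta)^2/(\theta_1 - \theta_0)^2 \rceil$ with $u_\gamma = \Phi^{-1}(\gamma)$ and the power function $\Phi(\sqrt{n}(\theta_1 - \theta_0)/\sigma + u_\alpha)$ of Problem 3.47 for $\theta_1 > \theta_0$; the function names and the numerical example are ours.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal law N(0, 1)

def min_sample_size(theta0, theta1, sigma, alpha, beta):
    """Least n for which the level-alpha test has Type II error <= beta."""
    u_alpha = STD_NORMAL.inv_cdf(alpha)
    u_beta = STD_NORMAL.inv_cdf(beta)
    return ceil(sigma**2 * (u_alpha + u_beta)**2 / (theta1 - theta0)**2)

def power(n, theta0, theta1, sigma, alpha):
    """Power of the one-sided test for theta1 > theta0."""
    u_alpha = STD_NORMAL.inv_cdf(alpha)
    return STD_NORMAL.cdf(sqrt(n) * (theta1 - theta0) / sigma + u_alpha)

n_star = min_sample_size(0.0, 1.0, 1.0, alpha=0.05, beta=0.05)
print(n_star, power(n_star, 0.0, 1.0, 1.0, 0.05))
```

Checking the power at $n^*$ and at $n^* - 1$ confirms that the bound is the smallest admissible sample size.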

3.49. Since $\mathcal{L}(T \mid H_0) = \mathcal{N}(0, 1)$, $\mathcal{L}(T \mid H_1) = \mathcal{N}(\Delta/\sigma, 1)$, we are dealing
with two simple hypotheses about the mean of the normal distribution with
variance equal to unity, on the basis of one observation on the random variable $T$.

The solution to Problem 3.47 gives the sought-for $\mathcal{X}^*_{1\alpha} = \{T \ge -u_\alpha\}$-test
with $\beta = \Phi(-u_\alpha - \Delta/\sigma)$. Taking into account that $u_\beta = \Phi^{-1}(\beta)$, we find $m$
from the equation

$$-u_\alpha - \Delta/\sigma = \Phi^{-1}(\beta) = u_\beta \quad\text{or}\quad \sigma^2 = \Delta^2(u_\alpha + u_\beta)^{-2}.$$

Finally, solving the latter equality with respect to m, we conclude that hi* is 
the minimal integer no smaller than ■ 

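3.50. Let $H_i\colon \theta = \theta_i$, $i = 0, 1$, and $\theta_0 > \theta_1$. If $X = (X_1, \ldots, X_n)$ is the
required sample, then the likelihood ratio statistic is

$$l(X) = \left(\frac{\theta_0}{\theta_1}\right)^{n}\exp\left\{-\frac{T}{2}\left(\frac{1}{\theta_1^2} - \frac{1}{\theta_0^2}\right)\right\},$$

where $\mathcal{L}_\theta(T/\theta^2) = \chi^2(n)$. The inequality $l \ge c$ is equivalent to $T \le t$. Therefore,
the critical boundary $t = t_\alpha$ at the significance level $\alpha$ is found from the
condition

$$\alpha = P_{\theta_0}(T \le t_\alpha) = P_{\theta_0}(T/\theta_0^2 \le t_\alpha/\theta_0^2) = F_n(t_\alpha/\theta_0^2),$$

where $F_n(t)$ is the distribution function of the law $\chi^2(n)$. Then $t_\alpha/\theta_0^2 = \chi^2_{\alpha,n}$
and the sought-for test has the form

$$\mathcal{X}^*_{1\alpha} = \{T \le \theta_0^2\chi^2_{\alpha,n}\}.$$

Its power is

$$W(\theta_1) = P_{\theta_1}(T \le \theta_0^2\chi^2_{\alpha,n}) = P_{\theta_1}\!\left(T/\theta_1^2 \le \left(\frac{\theta_0}{\theta_1}\right)^2\chi^2_{\alpha,n}\right) = F_n\!\left(\left(\frac{\theta_0}{\theta_1}\right)^2\chi^2_{\alpha,n}\right).$$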

For ft) < S, sve find in a similar way that 
assertion follows from 

- 1 ji(4<&<« I 



3.51. (t) The assertion follows from 



SSS(c) = #5{c). 

(2) If $c > 1$, then we use the first of the above relations to find

$$\alpha(c) + \beta(c) \le \frac{1}{c}(1 - \beta(c)) + \beta(c) \le 1.$$

If $c \le 1$, then we use the second relation to find

$$\alpha(c) + \beta(c) \le \alpha(c) + c(1 - \alpha(c)) \le 1.$$

(3) If $c > 1$, then we use the relation

$$\alpha(c) + \beta(c) = \int_{\mathcal{X}_1(c)} f_0(x)\,dx + \int_{\mathcal{X}_0(c)} f_1(x)\,dx = 1 - \int_{\mathcal{X}_1(c)} (f_1(x) - f_0(x))\,dx.$$

It is clear that $\mathcal{X}_1(c) \subseteq \mathcal{X}_1(1)$, and we have $f_1(x) - f_0(x) \ge 0$ on the set $\mathcal{X}_1(1)$.
Consequently,

$$\int_{\mathcal{X}_1(c)} (f_1(x) - f_0(x))\,dx \le \int_{\mathcal{X}_1(1)} (f_1(x) - f_0(x))\,dx,$$

whence $\alpha(c) + \beta(c) \ge \alpha(1) + \beta(1)$.

If $c < 1$, then we proceed from the relation

$$\alpha(c) + \beta(c) = 1 - \int_{\mathcal{X}_0(c)} (f_0(x) - f_1(x))\,dx.$$

(4) If the hypothesis Ho is true, then by the iaw of large numbers we have 



«*•(-* 



as $n \to \infty$. Therefore, for $\delta < 0$ we have

$$\alpha_n = P(T_n(X) \ge 0 \mid H_0) \le P(|T_n(X) - \delta| \ge |\delta| \mid H_0) \to 0$$

as $n \to \infty$. By symmetry (if we change $H_0$ for $H_1$ and vice versa), $\beta_n$ tends
to zero.

3.52. In this case

$$f_i(x) = \frac{1}{(2\pi)^{k/2}\sqrt{|A|}}\exp\left\{-\frac12(x - \mu^{(i)})'A^{-1}(x - \mu^{(i)})\right\}, \qquad i = 0, 1,$$

and the region $\mathcal{X}_1(c) = \{x\colon f_1(x)/f_0(x) \ge c\}$ has the form

$$\mathcal{X}_1(c) = \left\{x\colon a'x - \tfrac12 a'(\mu^{(0)} + \mu^{(1)}) \ge \ln c = c_1\right\}, \qquad a = A^{-1}(\mu^{(1)} - \mu^{(0)}).$$

For the random variable $Y = a'\xi - \frac12 a'(\mu^{(0)} + \mu^{(1)})$ we have

$$\mathcal{L}(Y \mid H_i) = \mathcal{N}\!\left((-1)^{i+1}\frac{\rho}{2},\ \rho\right),$$

where $\rho = (\mu^{(1)} - \mu^{(0)})'A^{-1}(\mu^{(1)} - \mu^{(0)})$ is the Mahalanobis distance between
the distributions $\mathcal{N}(\mu^{(0)}, A)$ and $\mathcal{N}(\mu^{(1)}, A)$. We now find the probabilities
of Type I and Type II errors for the $\mathcal{X}_1(c)$-test. They are

$$\alpha(c) = P(Y \ge c_1 \mid H_0) = \Phi\!\left(-\frac{c_1 + \rho/2}{\sqrt\rho}\right), \qquad \beta(c) = P(Y < c_1 \mid H_1) = \Phi\!\left(\frac{c_1 - \rho/2}{\sqrt\rho}\right).$$

If the probability $\alpha$ of Type I error is given, then the respective Neyman-Pearson
test is defined by the critical region $\mathcal{X}_1(c)$ for $c_1 = -\rho/2 - \sqrt\rho\,u_\alpha$,
$u_\alpha = \Phi^{-1}(\alpha)$, the probability of Type II error being equal to
$\beta = \Phi(-\sqrt\rho - u_\alpha)$. The $\mathcal{X}_1(1)$-test minimizes the sum of the probabilities of
errors (see Problem 3.51) and this sum is $2\Phi(-\sqrt\rho/2)$.
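The error probabilities of Problem 3.52 depend on the data only through the Mahalanobis distance, which makes them easy to tabulate. The sketch below assumes that the discriminant statistic $Y$ is normal with mean $-\rho/2$ under $H_0$, mean $\rho/2$ under $H_1$ and variance $\rho$ in both cases, and that the test rejects $H_0$ for $Y \ge c_1$; the function names and the value $\rho = 4$ are ours.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal law N(0, 1)

def error_probs(rho, c1):
    """Type I and Type II error probabilities of the test {Y >= c1}."""
    alpha = STD_NORMAL.cdf(-(c1 + rho / 2) / sqrt(rho))
    beta = STD_NORMAL.cdf((c1 - rho / 2) / sqrt(rho))
    return alpha, beta

def neyman_pearson_boundary(rho, alpha):
    """Boundary c1 giving Type I error exactly alpha."""
    return -rho / 2 - sqrt(rho) * STD_NORMAL.inv_cdf(alpha)

rho = 4.0
print(error_probs(rho, 0.0))                      # test with c = 1, c1 = 0
c1 = neyman_pearson_boundary(rho, 0.05)
print(c1, error_probs(rho, c1))                   # Neyman-Pearson test
```

With $c_1 = 0$ the two error probabilities are equal, and their sum reproduces $2\Phi(-\sqrt\rho/2)$.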

3.53. The solution of Problem 3.39 implies that the model's likelihood
ratio is monotone, and we may use the Neyman-Pearson test from Problem
3.39 as a u.m.p. test for the given problem.

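3.55. Since $\mathcal{L}_\theta(T_r) = \overline{Bi}(r, \theta)$ belongs to the exponential family with a monotone
increasing function $A(\theta) = \ln\theta$, the u.m.p. test exists and is defined by
the critical region of the form $\{T_r \ge t\}$ (see Sec. 3.5). By the Central Limit
Theorem we have $\mathcal{L}_\theta(T_r) \sim \mathcal{N}\!\left(\frac{r\theta}{1 - \theta},\ \frac{r\theta}{(1 - \theta)^2}\right)$ as $r \to \infty$, and we may use
the relations

$$\alpha = P_{\theta_0}(T_r \ge t_\alpha) \approx \Phi\!\left(\left(\frac{r\theta_0}{1 - \theta_0} - t_\alpha\right)\Big/\frac{\sqrt{r\theta_0}}{1 - \theta_0}\right)$$

to calculate the critical boundary $t = t_\alpha$ for large $r$. We then find the required
formula for $t_\alpha$.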
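3.57. We have

$$P_\theta(T = x) = f(x; \theta) = C_\theta^{x}C_{N-\theta}^{n-x}/C_N^{n}, \qquad x = 0, 1, \ldots, \theta,$$

and therefore the function

$$l(x) = \frac{f(x; \theta + 1)}{f(x; \theta)} = \frac{\theta + 1}{N - \theta}\cdot\frac{N - \theta - n + x}{\theta + 1 - x}$$

is monotone increasing in $x$. Consequently, the u.m.p. test exists and has the form
$\{T \ge t\}$, i.e., the hypothesis $H_0$ is rejected when $T$ is too large.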
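3.60. Consider the class

$$\mathcal{X}_{1\alpha} = \left\{\frac{\sqrt n}{\sigma}(\bar x - \theta_0) \le u_{\alpha_2}\right\} \cup \left\{\frac{\sqrt n}{\sigma}(\bar x - \theta_0) \ge -u_{\alpha_1}\right\},$$

where $\alpha_1 + \alpha_2 = \alpha$, and the power function is

$$W(\mathcal{X}_{1\alpha}; \theta) = \Phi(\sqrt n\Delta/\sigma + u_{\alpha_1}) + \Phi(-\sqrt n\Delta/\sigma + u_{\alpha_2}), \qquad \Delta = \theta - \theta_0.$$

Clearly, the power is minimal for $\Delta = \Delta_0 = \sigma(u_{\alpha_2} - u_{\alpha_1})/(2\sqrt n)$. Since
$W(\mathcal{X}_{1\alpha}; \theta_0) = \alpha$ (for $\Delta = 0$), the test is unbiased only for $\Delta_0 = 0$, i.e., for
$\alpha_1 = \alpha_2 = \alpha/2$. We then find the sought-for test

$$\mathcal{X}^*_{1\alpha} = \left\{\frac{\sqrt n}{\sigma}|\bar x - \theta_0| \ge -u_{\alpha/2}\right\}.$$

This is the u.m.p. test among all the unbiased tests.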
3.61. The likelihood function is 

n 
Imi 

Therefore, the inequality A («; ft) }s e£(i; ft,) + C|£i(»; ft>), which defines the 
best critical region, takes the form (see the solution to Problem 3. SO) 

c' T ^c-+c;T or [T<ri!U(r> r 2 ). 

Thus, the sought-for test is 

where the boundaries U < h are defined at the given significance level a by 
the conditions Wfpv) = a , w '(ftj) ■ 0, with the power function Wiff\- We 
know (see the solution to Problem 3. 5 0) that 

w{<?) = v^a-t^i) m p.cr^/o-v p«crss rj 

= ^(/l/S 2 ) + 1 - Fn{tl/d\ 

whence we have two equations 

F„((,/«J) + I - FAhMl) = a, t>k„<l t /a& = fa*l,(ft/flj). 

where *„(/) = f „'</). Putting /i = &&,„ h = fl^i -«,», we find that the 
conditions 

should be met. These conditions uniquely define x l,. n and v.?-,,,,.*. and hence 
/i and /j. The sought-for test takes the form 

m~ "W< olxi„.\ u(r> flix?-^,) 

and is the union of two one-sided u.m.p. tests from Problem 3.59. The values
of $(\chi^2_{\alpha_1,n}, \chi^2_{1-\alpha_2,n})$ for $\alpha = 0.05$ and $n = 2, 5, 10, 20$ can be found in
[7, p. 108].

3.62. The algorithm of solution here is the same as in Problem 3.61. Using the result
and notations from Problem 3.43, we find that the test is of the form

&1 = [T^ MU(r> t,\. 
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and its power function is 

W<$) = Fi„(2d/e) t- 1 - J=a,{2n/fl). 
We have two equations W(8n} = a and H"(0o) = to define the boundaries 
t, and h. For t, = - j(*.*w 6 = "J *»-<u.in the equations take the Form 

X?,.»«fa*Cx2,.a<) = xf-«,2**I>i(Xl-..i.I»). «1 + « - "■ 
We find xj,,^ and Xi-mi.in and conclude that at the significance level a the 
sought- for test has the form 

("V *" i V -*° l 1 

^a ■ j Xj ja =S - x5,.in or £_jX, ? y Xi-.j.m t 

1-1 i- I 

and is a union of two respective one-sided u.m.p. tests for the alternatives $H_1^-\colon$
$\theta < \theta_0$ and $H_1^+\colon \theta > \theta_0$, which follows from Problem 3.43 and the properties
of the exponential model.

3.63. In our case (see the solution to Problem 3.39) the likelihood function 
is 



uk, *) = n c^i - 0)* **, 



I- 1 



"Therefore, 

W fcO-» tat< *"- 7 y-*"'. r ( x)=S^ 

58 8(1 - 8) X - J 

j- i 

Besides (see Problem 2.43), we have tffl = fr/|tf(l - WJ. and the soughi-for 
test (see the hint) has the form 

i = [jr- k-nftil/VSrjiftd - fc) 5= -K./3). 

The test's power for the given alternatives is calculated as in Problem 3.40. 
3.64. We have 

jf 

i(x; A) = e-*'v- r M/(jc 1 !... *,,!). T<x> = 2 */. 

la i 

and hence l/(x; S) = [T{x) - n»)/ft /(«) = 1/0 (see Problem 2.43). The sotight- 
for t«t has the form 



Its power is calculated as in Problem 3.41. 
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3.65. We use the notations from Problem 2.119 and find that the critical 
regions J\„ for our problems have the form s 

(I) <0U = (x: x £ 8,o + h- a .n-tS(x)/Vn^JU 

{2)- J /,„ = fx:x $ 6,o - /i- a ,„-,S(x)/Vn - 1 1 (compare with the results 
of Problems 3.47 and 3.58 for the case when the variance is known); 

(3) «*!. = j «: -Jn - ' ; ■■ * ■ ? 'i - <*/!,«- i { (compare with the results 

of Problems 3.60 and 3.73); 

(4) *",„ = [x: nS*M > fl&,xi-„.,-il; 

(5)^i„ - \x:nS*M < «2oxl,»-i( (compare with the results of Problems 
3. JO and 3,59); 

(6) ^,„ = |ic rtS'W < flIoxi/ !r „-i or »S l W S ffldXi -««.»- tl (com- 
pare with the results of Problems 3.61 and 3.74). 

Indeed, let us investigate the fiist problem (all the other problems are 
solved in the same wa y). We use the lower -y-confidence interval J^(X) = 
{Qy.JC- r,,»-iS(X)/>m - 1 < Si < °°1 for 6, and find that the acceptance 
region for the hypothesis Ho with the significance level a = 1 — y has the form 



Ma = (k: X - t,,,-iS(x)/Vii - I < Sjol. 

But s#J« = aS„ is as given in the problem. 

3.68. Using the -^-confidence interval for t constructed in Problem 2.127, 
we find that the acceptance region for the hypothesis hit, is 



^Oa = \ (.«, >): -P~tt--rVl.2n.2m < ' < - ^<l+-t)/2,2ii,2m \ • a - J - 

s required test is defined by the critical region 

?\« = { (x. y): - ^ Faa,i*,3m or z 5 F, -<,n.i~,t* \ , 

L y y j 

tr is 

■ P« ( r^^ T/ ? o/i,ij,,j», 1 + Ps [ r=.5> xFi-o^i.in.iM 1 
\ ▼ J \ Y ) 



and its power is 
WW 



Si 
- -F(T-F < ,/2.2n,2;n; 2n, 2m) + I - ^(rf) - a /i,in,2iii; 2n, 2m), r = — . 

3.69. We invert the -y-confidence interval constructed in Problem 2.128 and 
find the critical set for the hypothesis tfo, viz., 

£3. = l*: *(i) <S fl o or *t»i S 9 « _ 0° <*V")- 



Since 



f\ for t < e, 

'**■>»"-[«-"<.-» for f»« 
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(see the solution to Problem 2,128), the power function of the test is 
WW) = ?t{X m sj ft,) + P»<*t„ > ft> - (In «)/«) 



■(■ 



1, Ss ft> ^ (In «)/n, 

etc"""**', fti < B < ft, - (In a)/n, 

1 - (1 -c.Je"'*" 9 -', fl «ft>, 



whence follows that W{6) > « for all e. 
3.10. The sought-for tesi has the form 

#i a = [*: Jf(«) C ft>a ,/B or x w > ftil, 

and its power function is 

W(6) m Pj(X<„, ^ Aib"") + P. (AW Ss ft>) 



-™n(«0y.l) + .-«» B ((£)\l) 



1, < fea l/n r 

<1 e s£ ffn, 



-<■-(?)■■ 



whence it follows that IV(fi) 3 a for all $. 

3.71. The assertion of the choice of the boundaries in an unbiased test 
follows from the fact thai the distributions of the respective statistics coincide. 

3.7 J. Inverting the confidence region constructed in Problem 2.132, we find 
the critical region 

*im = {*: «<Xi - Sm>. & - dia)'£"'Cfj - fl«o. xt - ftso) > Xi-„.i)- 

3.73. Here $\Theta = \{\theta = (\theta_1, \theta_2)\colon -\infty < \theta_1 < \infty,\ \theta_2 > 0\}$ and (see the solution
to Problem 2.86)

$$\sup_{\Theta} L(x; \theta) = L(x; (\bar x, s)) = (2\pi e s^2)^{-n/2}, \qquad s^2 = S^2(x).$$

We have $\Theta_0 = \{\theta = (\theta_1, \theta_2)\colon \theta_1 = \theta_{10},\ \theta_2 > 0\}$ and

$$\sup_{\Theta_0} L(x; \theta) = L(x; (\theta_{10}, s_0)) = (2\pi e s_0^2)^{-n/2},$$

where $s_0^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \theta_{10})^2$ is the m.l.e. for $\theta_2^2$ under the hypothesis $H_0$. Since
$s_0^2 = s^2 + (\bar x - \theta_{10})^2$, we have

$$\lambda_n = \lambda_n(x) = (s_0^2/s^2)^{-n/2} = (1 + t^2/(n - 1))^{-n/2},$$

where $t = t(x) = \sqrt{n - 1}\,(\bar x - \theta_{10})/s$. Therefore, the inequality $\lambda_n \le c$ is
equivalent to $|t| \ge c'$. In this case the likelihood ratio test has the form

$$\mathcal{X}_{1\alpha} = \{x\colon \sqrt{n - 1}\,|\bar x - \theta_{10}|/s \ge c'\}.$$

Since the test statistic $t(X)$ has Student's distribution $S(n - 1)$ under the
hypothesis $H_0$, the boundary will be $c' = t_{1-\alpha/2,\,n-1}$ (compare with Problem
3.65 (3)).
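The identity between the likelihood ratio and Student's statistic in Problem 3.73 can be verified on any data set. The sketch below assumes that $S^2(x)$ denotes the sample variance with divisor $n$ and that $\lambda_n = (s_0^2/s^2)^{-n/2}$; the sample values and the function name are ours and purely illustrative.

```python
from math import sqrt

def lr_and_t(x, theta10):
    """Likelihood ratio lambda_n and Student's statistic t for H0: mean = theta10."""
    n = len(x)
    mean = sum(x) / n
    s2 = sum((v - mean) ** 2 for v in x) / n        # S^2(x), divisor n
    s02 = sum((v - theta10) ** 2 for v in x) / n    # variance m.l.e. under H0
    lam = (s02 / s2) ** (-n / 2)
    t = sqrt(n - 1) * (mean - theta10) / sqrt(s2)
    return lam, t

x = [1.2, 0.7, 2.1, 1.9, 0.4]   # illustrative sample
lam, t = lr_and_t(x, theta10=1.0)
print(lam, (1 + t * t / (len(x) - 1)) ** (-len(x) / 2))
```

Since $\lambda_n$ is a decreasing function of $|t|$, small values of the likelihood ratio correspond exactly to large values of $|t|$.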

The $S(n - 1)$-distribution is approximated by $\mathcal{N}(0, 1)$ for large $n$ (see
Problem 1.47). We may then take the critical boundary $c'$ to be approximately
$-u_{\alpha/2}$. Note also that $u^2_{\alpha/2} = \chi^2_{1-\alpha,1}$. The information matrix $I(\theta)$ for the
model $\mathcal{N}(\theta_1, \theta_2^2)$ was computed in Problem 2.44, and the first principal minor
of the matrix $I^{-1}(\theta)$ is $\theta_2^2$. According to the asymptotic theory of m.l.e.'s, the
maximal power for the given alternatives is $1 - F_1(\chi^2_{1-\alpha,1}; \lambda^2)$, where

3.74. In this case (see the solution to Problem 3,73) 

sup L(x; B) = £,(*; <*, $)) = {lucts 1 )-"' 2 , s 1 = S 2 (x). 
e 

su P L(x; 9) = L(x; (*, Pa,)) = (2 1 rSi 1 ft )-' ,/i e""' i/1 *''», 
whence 

u*> = (te"*>y"*, i = sVffio. 

Here the inequality X., < c, which defines the critical region of the likelihood 
ratio test, is written in the form it sS, (, ) U (/ > r z J, t, < /j, and the test is 

,<?T = fx: S*(jt)/flfi> < /i or S^xyflJo 5s RtJ. 
By the him. the power function of this test is 

—Of .)♦-*-(*■)■ 

Choosing /i = x5,,n-i/», 6 ■ Xi-«,.*- |/« for or, + at = a, we find that 
H^fSo) ■ o, q*.d. (compare to Problem 3.65 (6)). 

For the lest to be unbiased, the condition W'ffis)) = should be met and 
we must solve the equation 

xl,,,-ilc*-iixl„*-i) = xf -<.„.- i*"-i (Xi-™.»- 0- 

Remark. The asymptotic (as n ^ =o) version of this test was investigated 
in [7, pp. 212-213}. 

3.75. Since the maximum likelihood estimate for the parameter 6 in the 
model Bi(l, #) constructed from the sample of size n is S„ = A" (see Problems 
2,84 and 2.4B), we have the statistic 

«*a - *&***-* 



jr^d - AT" 0- - 10 
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In our case the model is polynomial with jV = 2 outcomes, and therefore (see 
[7, pp. 207-208]) as n -* no the Lmiting distributions of the statistics --2 in X« 
and 

, <r - netf in - r - «(i - ft,))* (T - /.ft,)* -, 

X\ = — + —■ — = — , T = nX, 

«ft, rt{\ - ft.) nft,(' - ftO 

coincide under the hypothesis Ho and are ^(l). This means that 
j% a (T) - --^(rtft,, nft>(l - ft))) a!n-»». Thus the [V « c|-test is asymptoti- 
cally equivalent to the {JfJ ^ t j-test, which coincides with the test constructed 
in Problem 3.63 for k => 1. __ 

3.76. In our case the m.l.e. is &, = JC = 77n (see Problem 2,109), and we 
have \„ = (nflo/r) r e T ""'''. It follows front the solution to Problem 3.64 thai 
the statistic Q?> = (T - nft>)*/n0 o , and the likelihood ratio test is asymptoti- 
cally equivalent to the | [7* - n9o\/yfh~8a > /(-test investigated in Problem 3.64. 

3.77. We use Lj(Bj) = ffj^U - 9j)">'' ' * J> to denote the likelihood func- 
tion for the jth sample, j = 1 *. Since the samples are independent, the 

likelihood function for all the data is £<»,, . . ., 0*) = Z,,(ft). . .U(0t). The 
m.Lc. of the Bernoulli model parameter coincides with the arithmetic mean 
of the sample, and we have 

* * __ 
max Lifi ft,) = II wiA) = II L AXji 

ft, ,a» v-i */ j-i 

* _ — — 

max £(»,, .... ft,) = ma*£,<ft ..., 9) = max ^(l - fl*"-" 



JT"*(1 - Jf)" <1-Si , 



where X - — (mJfi + . . - + n*A*), " » "i + - - • + "«• 
rl 
Then the likelihood ratio statistic bas the form 






n/£\"A / l -jf W 1 -*> 
W Vl - %) 
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The dimensionality of the null hypothesis is dim H = 1 (one degree of free- 
dom), and the asymptotic likelihood ratio test has the form 

£&. = J 2 S "AmQ* 5tj - In 5} + (t - xj) 

X (In - X,) - In (1 -X»J ^xl-^-A- 
The standard x 2 uniformity test (7, p. 161] is 

ix(] - X) *—• ) 

3.78. We solve this problem as the previous one. Using the same notations 
and taking into account that 



LASH = e""'V*Y II- Xa\. jm I. , . 

i k 



we find that 



For n,, . ., 11* -* » the likelihood ratio test has the form 
(*!*- J2 2 "Aflnly-lnJtiif-^-,]. 

3.79. We use Ljifiij, 8i) to denote the likelihood function for the jth sample, 

t 

j = 1. . -,k, and £(fljj, ...,*it, <fe) = XI ijffw. W to denote the likeli- 

/-' 
hood function for all the data. We stipulate that n = fti + ... +• 'i*, 

I * I * 

Sj = — 2 "jS/, AT = — 2 »»;.¥/. Let S 1 be the sample variance for all the 

* ft n j-\ _ _l_ __ 

data. Then (see Problem 2.114) (Ai At, So} and L(Xi, . . ., X k . So) = 

(2»eS^)"'"' 2 arem.l*;s for the parameters (On, . . .,0i*. tfj). Under the hypothe- 
sis Wo all the data may be considered to be a sample of size n from the popula- 
tion . t'ifiy, e|). Therefore (see Problem 2.86), 

max i(fl, Pi, fli) = L{X, ...,X~,S) = (2ireS 3 ) """. 
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Thus, the likelihood ratio statistic is 



*"--©""*-(*«7EW)" 



The number of the model's parameters is Xr + 1, and the dimensionality of 
the null hypothesis is 2, Therefore, the asymptotic likelihood ratio test has the 
form 

*£,-„ - 1/TflnS 1 - In SJ) > x\ -«,*- 1 1- 
If Jt — 2, then we can directly verify that 

nS 1 = n,Si + mS| + — (JFi - X 2 f, 

n 

whence 

*=m = + **/<» - 2))"" /2 , 

with T defined in the statement of the problem . Here the inequality >^ tnz ^ c 

is equivalent in |T] S t, whence follows the required form or the lest. 

3.80. Let Ljifiij, fly) be the likelihood function for the >th sample, 
k 

j — I, . . . , A:, and L = J J Lj(&ij, &jj) be the likelihood function for all the 

J-r 
data. As in the previous problem, wc have for the common model 

max L - JJ max Ljtfij, Bzj) - Tl (2ireS/j " V s , 
while under the hypothesis Hq 

max L = max XI M*y. ftO = (2ircSo> " "*. 
Whence 

w..« - n &/ss = n (S//s,)\ 

Here we have 2fc parameters, and the dimensionality of the null hypothesis 

is Ar + 1, Therefore, the asymptotic likelihood ratio test for »t ffi* -• » 

has the- form 

*l* - | 2 "J( ] " So - In Sj) 5s xi-M-i J- 



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For k = 1 we. have 

and the inequality X, |ni ^ c is equivalent to the condition 

[F^dlUj.F^ ci\, c\ < a.. We use the hint and obtain the required test, 

3.81. Ail Xi are distributed as I (fit , 2 ) under the hypothesis Ha. But 

in this case (see the solution to Problem 1.53) the distribution of the statistic 

Xi - X 

n = - is independent of the parameters (fit, 61) and is symmetric about 

Vn - IS 
zero with 



*tn 


> u> 


.»., 


f 


- B a ! 


it - 
~~2 


2 


Consequently, 
















9t-*uAH& = 


Bl 


'«" 


^; 


n 



% 



< it < I. 



i.e. r the probability to make an error when rejecting the hypothesis H B Tor 
"■i)| > v* is «. Thus, ?i„ is a suitable criiicaJ region. 

3.82. Since here the hypothesis // is equivalent to cov {£,, fe) = 0, by 
Problem 1.59 (c) we have 

./"(eWG* - 2)/(l - cJ)|HW = S(n - 2). 

Whence (due to the symmetry or Student's distribution) 

P(le„|V(n - 2)/(l - gl) > h-*2*~»V& = a. 

The latter inequality is equivalent to (hat defining the critical region 3>\ a , and 
hence P(.'^i a |Mi) = a. This means that the probability to make an error when 
rejecting H a far p„ 6 -»i a is a, ijt, #i a is the sought -for test. 

3.83. By Problem 1,6$ S(T„\H„) ~ 1(0, I) as n -* =s and, therefore, 
P(|r,| 5» -u M |// ) - 2+(u„, 2 ) - cr, q.e.d. 

3.84. Suppose that the observable random variables $X_1, \ldots, X_n$ are independent
and have the same and known mean $\mu$ and finite variances. If the
hypothesis $H_0$ implies that all the variables $X_i$ are identically distributed, then
for large $n$ the statistic $T_n$ is approximately normally distributed as $\mathcal{N}(0, 1)$
under $H_0$. Then we can use this statistic to construct the goodness of fit test
for $H_0$ as in Problem 3.83.

3.85. {]) We have 

w„(fto = P«„(r» » 7„) = p«„{v^(r^ - ptftyj/^ft,) > -«,), 
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and by property (a), W„{0o) -» *(«„) = a as n -** «. ia, the s'io-test asymptot- 
ically has the significance level a. 
(2) Reasoning as above, we have 

*■,(*<"') - p*,,tr. > -v») - p„..(V«{r 1 , - t&*WW*i > <•>■ 

where 

According to property (b), -*„(«) — py(So)/«(<W + «■« as " ~* ™- B y 
property (a), these relations give the required limiting equality. 

3.86. From (be previous proof we have 

*{8*/7l+ «.) = Urn W? (k + JLS m i™ h^.' /fla * JL J~J 

= lira Wgj ( So + -=r J = *(3VJiA + u„). 

Whence Ve? ■ ^e^X, Or X = < 2 Vfi'. 

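3.87. In our case $\mathcal{L}(T_n^{(1)}) = \mathcal{N}(\theta, \sigma^2/n)$, and (see Problems 3.86 and 3.85) the test has the form

$$\mathcal{X}_{1\alpha} = \{\bar{X} \ge \theta_0 - u_\alpha \sigma/\sqrt{n}\}.$$

Its Pitman efficiency is

$$\beta(\beta; \alpha) = \Phi(\beta/\sigma + u_\alpha), \qquad e_1 = \sigma^{-2}.$$

We get $\mathcal{L}(T_n^{(2)}) \sim \mathcal{N}(\theta, \pi\sigma^2/(2n))$ in a similar way (see Problem 1.32), whence

$$e_2 = 2/(\pi\sigma^2).$$

Consequently, $e = e_2/e_1 = 2/\pi$.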

3.88. We write the likelihood function $L(x; \theta)$ as

$$L(x; \theta) = \big((2\pi)^n |\Sigma|\big)^{-1/2} \exp\{-\tfrac{1}{2} Q_\theta(x)\},$$

where the quadratic form $Q_\theta(x) = (x - \theta\mathbf{1})'\Sigma^{-1}(x - \theta\mathbf{1})$ can be represented as a sum

$$Q_\theta(x) = Q_0(x) - 2\theta\,\mathbf{1}'\Sigma^{-1}x + \theta^2 Q_0(\mathbf{1}).$$

We note that $\mathbf{1}$ is the last column (and the last row) of the matrix $\Sigma$, i.e., $(0 \ldots 0\ 1)\Sigma = \mathbf{1}'$, and find $\mathbf{1}'\Sigma^{-1} = (0 \ldots 0\ 1)$, whence $\mathbf{1}'\Sigma^{-1}x = x_n$ and $Q_0(\mathbf{1}) = \mathbf{1}'\Sigma^{-1}\mathbf{1} = 1$. Then

$$Q_\theta(x) = Q_0(x) + \theta^2 - 2\theta x_n,$$

which means that $L(x; \theta) = g(T(x); \theta)h(x)$, where $T(x) = x_n$ and $g(T; \theta) = \exp\{\theta T - \tfrac{1}{2}\theta^2\}$. According to the factorization criterion, $X_n$ is a sufficient statistic, and all the statistical problems for this model are solved on the basis of this statistic. Since $\mathcal{L}_\theta(X_n) = \mathcal{N}(\theta, 1)$, we have a normal distribution with an unknown mean. Then our problem is equivalent to verifying a hypothesis on the unknown mean of a univariate normal distribution when one observation is given, which was done in Problems 3.47, 3.58, and 3.60.



To Chapter 4 
4.1. We have 

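4.2. If not all the $t_i$ are the same, then

$$\hat\beta_1 = \bar{X} - \bar{t}\hat\beta_2, \qquad \hat\beta_2 = \sum_{i=1}^{n} (t_i - \bar{t})(X_i - \bar{X}) \Big/ \sum_{i=1}^{n} (t_i - \bar{t})^2.$$

Here $E\hat\beta_i = \beta_i$, $i = 1, 2$, and

$$D\hat\beta_1 = \sigma^2 \sum_{i=1}^{n} t_i^2 \Big/ \Big(n \sum_{i=1}^{n} (t_i - \bar{t})^2\Big), \qquad D\hat\beta_2 = \sigma^2 \Big/ \sum_{i=1}^{n} (t_i - \bar{t})^2.$$

Therefore, if $\sum_{i=1}^{n} (t_i - \bar{t})^2 \to \infty$ as $n \to \infty$, then the estimates $\hat\beta_i$ are consistent.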
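
The least-squares formulas of 4.2 are easy to check numerically. The sketch below assumes the simple linear model X_i = beta_1 + beta_2 t_i + eps_i; the function name is illustrative.

```python
def least_squares_line(ts, xs):
    """Least-squares estimates (b1, b2) for the model X_i = b1 + b2 * t_i + eps_i."""
    n = len(ts)
    if n != len(xs) or n < 2:
        raise ValueError("need at least two paired observations")
    t_bar = sum(ts) / n
    x_bar = sum(xs) / n
    stt = sum((t - t_bar) ** 2 for t in ts)
    if stt == 0:
        raise ValueError("all t_i coincide: the slope is not identifiable")
    # slope: sum (t_i - t_bar)(X_i - X_bar) / sum (t_i - t_bar)^2
    b2 = sum((t - t_bar) * (x - x_bar) for t, x in zip(ts, xs)) / stt
    # intercept: X_bar - t_bar * b2
    b1 = x_bar - t_bar * b2
    return b1, b2


b1, b2 = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
```

The four points lie exactly on the line x = 1 + 2t, so the estimates reproduce the intercept 1 and the slope 2.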

4.3. We have $\tilde\sigma^2 = S(\hat\beta)/(n - 2)$, where $S(\hat\beta) = \sum_{i=1}^{n} (X_i - \bar{X})^2 - \hat\beta_2^2 \sum_{i=1}^{n} (t_i - \bar{t})^2$, and $\hat\beta = (\hat\beta_1, \hat\beta_2)$ are as in Problem 4.2. If $DS(\hat\beta) = o(n^2)$ as $n \to \infty$, then $\tilde\sigma^2$ is a consistent estimate for $\sigma^2$. Specifically, for the normal model we have $\mathcal{L}(S(\hat\beta)/\sigma^2) = \chi^2(n - 2)$ and, therefore, $DS(\hat\beta) = 2\sigma^4(n - 2) = o(n^2)$.



7s ft-V. 

4.6. Here / = ^ft^^, /- S*^^ Therefore, Ef - /. 



4.4. We have cov {(J, , fe) = - a 1 / 

i ] 



3 

d/ - y] ib ~ a)> * r cov &. m. 

j,'— • 
4.9. The confidence intervals are found from



f & * (,,„«,,-: J S( ^/ (n " 2) S <'' " «>* ) ' 

The -^-confidence ellipse has the form 

•4(-*l = j »■ W\ - fi>* + 27(0, - 0i)(ftj - ft) 

n ■*— ■ n(rt - 2) J 

where J3i, ft, and S(|J) are taken from Problems 4.2 and 4.3. 

4.10. Since $\int_{-a}^{a} \varphi(t)\,dt = 2a\beta_1$, we are dealing with the confidence interval for $\beta_1$ from the previous problem.

4.11. Here we.have the theoretical dependence o(f) = a(0) + ut, and the 
confidence ellipse is constructed as in Problem 4.9. 

4.14. We substitute in (4.11) $m = 3$, $\lambda_1 = (1\ 0\ 0)$, $\lambda_2 = (0\ 1\ 0)$, $\lambda_3 = (0\ 0\ 1)$,
and find the sought-for system



|(ft± /_L-(^S(/S)K„ 3 .._ S Y 7=l,2.3j. 

4.15. We use $X_1, \ldots, X_8$ to denote the consecutive observations and obtain the following normal regression model:

$$X_1 = \beta_1 + \varepsilon_1,\quad X_2 = \beta_2 + \varepsilon_2,\quad X_3 = \beta_1 + \beta_2 + \varepsilon_3,\quad X_4 = 180 - \beta_1 - \beta_2 + \varepsilon_4,$$
$$X_5 = \beta_1 + \varepsilon_5,\quad X_6 = \beta_2 + \varepsilon_6,\quad X_7 = \beta_1 + \beta_2 + \varepsilon_7,\quad X_8 = 180 - \beta_1 - \beta_2 + \varepsilon_8.$$

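4.17. Taking into account the hint, we have

$$ES(\hat\beta) = ES(\beta) - E(\hat\beta - \beta)'A(\hat\beta - \beta),$$

where $ES(\beta) = E\sum_{i=1}^{n} \varepsilon_i^2 = n\sigma^2$ (see (4.4)) and

$$E(\hat\beta - \beta)'A(\hat\beta - \beta) = \sum_{i,j=1}^{k} a_{ij} \operatorname{cov}(\hat\beta_i, \hat\beta_j) = \sigma^2 \sum_{i,j=1}^{k} a_{ij} a^{ij} = \sigma^2 \operatorname{tr}(AA^{-1}) = k\sigma^2$$

($a^{ij}$ being the elements of $A^{-1}$). We then obtain $ES(\hat\beta) = (n - k)\sigma^2$. Since $X - Z'\hat\beta = BX$ (see (4.5)) and $B^2 = B$, we have $S(\hat\beta) = X'B^2X = X'BX$.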

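4.18. Here $A = ZZ'$ is a diagonal matrix with the diagonal elements $z_j'z_j$, $j = 1, \ldots, k$, where $z_j$ is the $j$th column of the matrix $Z'$. Therefore,

$$\hat\beta_j = z_j'X/z_j'z_j, \qquad D\hat\beta_j = \sigma^2/z_j'z_j, \qquad \operatorname{cov}(\hat\beta_j, \hat\beta_r) = 0,\ j \ne r.$$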

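4.21. The likelihood function for the model (4.3) has the form (see (4.4))

$$L(x; \beta, \sigma^2) = (2\pi\sigma^2)^{-n/2} \exp\{-S(x; \beta)/(2\sigma^2)\},$$

and while maximizing it in $\beta$ we minimize the quadratic form $S(x; \beta)$. It follows that the m.l.e. of $\beta$ coincides with the l.s.e. $\hat\beta$ and the value $\hat\sigma^2$ of $\sigma^2$ minimizes $S(\hat\beta)/\sigma^2 + n \ln \sigma^2$, whence $\hat\sigma^2 = S(\hat\beta)/n$. Taking into account Problem 4.17, we get

$$E\hat\sigma^2 - \sigma^2 = \frac{n - k}{n}\sigma^2 - \sigma^2 = -\frac{k}{n}\sigma^2.$$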

4.22. The solution follows from (4.10) for $m = 1$, $T = \lambda'$. We should take into account that $F_{\gamma,1,n-k} = t^2_{(1+\gamma)/2,\,n-k}$ (see Problem 1.50).

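4.23. Putting $\lambda = (1, t)$ in Problem 4.22 and using the results of Problems 4.2-4.4, we find the sought-for interval

$$\left(\bar{X} + (t - \bar{t})\hat\beta_2 \pm t_{(1+\gamma)/2,\,n-2} \sqrt{\frac{S(\hat\beta)}{n(n-2)} \Big[1 + \frac{n(t - \bar{t})^2}{\sum_{i=1}^{n} (t_i - \bar{t})^2}\Big]}\right).$$

4.25. Substitute \ t = t™, i = 1, .,.,«, into (4.11),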




4.27. We have m = 1 in (4.12) and 

St = mm S £*J ~ S»6 - ft) 2 = 5<ft + (ft - ft-)* 2 fo " 3*- 

ft 1st ' - I 

We also have $F_{1-\alpha,1,n-2} = t^2_{1-\alpha/2,\,n-2}$ (see the solution to Problem 4.22). We then find that the F-test (4.12) has the required form.

4.29. We have a normal regression model with n - 8, (I = {m, pi, w, in), 
and the respective quadratic form is 

SOS) = S (XJ° " *)* = S (*f - 5*V + 2 S (* w " w) 1 - 
'.J U ' 

Hence the l.s.e. is £; = ^ w . r = I, .... 4, min S(0> = 

Under the hypothesis Ho we have 

St = min 2 (Xf> - u) 1 = min ( £ <X« - 2)* + n(F - rt? J 
» i.J * \ U r 

= £ (xf - W = s, + 2 £ CF» - 55*, 

I.J < 

and for m = 3 the F-test (4.12) has the form 




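4.37. We use $Z' = \|\mathbf{1}\ a\|$ to denote an $n \times 2$ matrix of the column-vectors $\mathbf{1} = (1, \ldots, 1)$ and $a = (a_1, \ldots, a_n)$ and find the solution in the form $\hat\theta = A^{-1}ZG^{-1}Y$, where $A = ZG^{-1}Z'$. We then multiply the matrices ("row by column" with the block-matrices treated as the matrix elements) and reduce the result to the form

$$\hat\theta = (\hat\theta_1, \hat\theta_2) = \frac{1}{\Delta}(a'QY,\ -\mathbf{1}'QY),$$

where $\Delta = (\mathbf{1}'G^{-1}\mathbf{1})(a'G^{-1}a) - (\mathbf{1}'G^{-1}a)^2$ and $Q = G^{-1}(a\mathbf{1}' - \mathbf{1}a')G^{-1}$. The matrix of the second moments will be

$$A^{-1} = \frac{1}{\Delta} \begin{Vmatrix} a'G^{-1}a & -a'G^{-1}\mathbf{1} \\ -\mathbf{1}'G^{-1}a & \mathbf{1}'G^{-1}\mathbf{1} \end{Vmatrix}.$$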
To Chapter 5

5.1. (1) Here the sample space $\mathcal{X} = \{0, 1\}$ consists of two points, and we have two solutions for every $x \in \mathcal{X}$. We then have four decision functions $\delta_i = (\delta_i(0), \delta_i(1))$, $i = 1, 2, 3, 4$: $\delta_1 = (d_1, d_1)$, $\delta_2 = (d_1, d_2)$, $\delta_3 = (d_2, d_1)$, $\delta_4 = (d_2, d_2)$. Let $R_i = (R(\theta_1, \delta_i), R(\theta_2, \delta_i))$ be the risk vector for the procedure $\delta_i$, where

$$R(\theta, \delta_i) = L(\theta, \delta_i(0))(1 - \theta) + L(\theta, \delta_i(1))\theta.$$

Then

$$R_1 = (0, 2), \quad R_2 = (\tfrac{1}{3}, \tfrac{2}{3}), \quad R_3 = (\tfrac{2}{3}, \tfrac{4}{3}), \quad R_4 = (1, 0).$$

The procedure $\delta_2$ is preferable compared to $\delta_3$, while $\delta_1$, $\delta_2$, $\delta_4$ are incomparable and hence form the set of admissible decision rules. Here $m(\delta_2) < m(\delta_4) < m(\delta_1)$, i.e., $\delta_2$ is the minimax procedure.

(2) The first solution. For the Bayes risk $r(\delta_i) = \alpha R(\theta_1, \delta_i) + (1 - \alpha)R(\theta_2, \delta_i)$ we have $r(\delta_1) = 2(1 - \alpha)$, $r(\delta_2) = (2 - \alpha)/3$, $r(\delta_4) = \alpha$. If $\alpha \le 1/2$, then $\min_i r(\delta_i) = r(\delta_4)$, i.e., $\delta^* = \delta_4$. If $1/2 < \alpha \le 4/5$, then $\min_i r(\delta_i) = r(\delta_2)$, i.e., $\delta^* = \delta_2$. If $\alpha > 4/5$, then $\delta^* = \delta_1$. The Bayes risk is plotted in Fig. 8.

The second solution. We calculate the a posteriori distribution $\pi(\theta_i|x) = f(x; \theta_i)\pi(\theta_i)/f(x)$, $x = 0, 1$, $i = 1, 2$, where $f(x; \theta) = \theta^x(1 - \theta)^{1-x}$,

$$f(x) = f(x; \theta_1)\pi(\theta_1) + f(x; \theta_2)\pi(\theta_2) = \alpha\theta_1^x(1 - \theta_1)^{1-x} + (1 - \alpha)\theta_2^x(1 - \theta_2)^{1-x}.$$

We have $\pi(\theta_1|0) = \alpha(1 - \theta_1)/f(0)$, $\pi(\theta_2|0) = (1 - \alpha)(1 - \theta_2)/f(0)$, $\pi(\theta_1|1) = \alpha\theta_1/f(1)$, $\pi(\theta_2|1) = (1 - \alpha)\theta_2/f(1)$. In our case the average loss with respect to this a posteriori distribution is

$$E(L(\theta, d_1)|0) = L(\theta_1, d_1)\pi(\theta_1|0) + L(\theta_2, d_1)\pi(\theta_2|0) = \frac{2(1 - \alpha)}{3f(0)}$$

for $x = 0$ and the solution $\delta(0) = d_1$, while the average loss for $\delta(0) = d_2$ is $2\alpha/(3f(0))$. We compare the losses and see that for $\alpha < 1/2$ the loss for $d_2$ is smaller, i.e., $\delta^*(0) = d_2$, while for $\alpha > 1/2$ we have $\delta^*(0) = d_1$. We act in a similar way for the case of $x = 1$ to find that $\delta^*(1) = d_2$ for $\alpha \le 4/5$, while $\delta^*(1) = d_1$ for $\alpha > 4/5$. We have thus found the Bayes solutions $\delta^*(x)$ at every point $x = 0, 1$ for all a priori distributions.
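
The enumeration in 5.1 can be reproduced mechanically. The sketch below is only an illustration: the problem statement is not reproduced in this section, so the parameter values theta_1 = 1/3, theta_2 = 2/3 and the loss table are assumptions chosen to agree with the Bayes risks 2(1 - a), (2 - a)/3 and a quoted in the solution.

```python
from fractions import Fraction as F
from itertools import product

# Assumed inputs (not taken from the problem statement, see the note above).
THETA = {1: F(1, 3), 2: F(2, 3)}
LOSS = {(1, "d1"): F(0), (1, "d2"): F(1), (2, "d1"): F(2), (2, "d2"): F(0)}

# Decision functions (delta(0), delta(1)) in the order delta_1, ..., delta_4.
RULES = list(product(["d1", "d2"], repeat=2))


def risk(i, rule):
    """R(theta_i, delta) = L(theta_i, delta(0)) (1 - theta_i) + L(theta_i, delta(1)) theta_i."""
    th = THETA[i]
    return LOSS[(i, rule[0])] * (1 - th) + LOSS[(i, rule[1])] * th


def bayes_rule(alpha):
    """Rule minimising alpha * R(theta_1, .) + (1 - alpha) * R(theta_2, .), with its Bayes risk."""
    def bayes_risk(rule):
        return alpha * risk(1, rule) + (1 - alpha) * risk(2, rule)
    best = min(RULES, key=bayes_risk)
    return best, bayes_risk(best)


def minimax_rule():
    """Rule with the smallest maximal risk, with that maximal risk."""
    def max_risk(rule):
        return max(risk(1, rule), risk(2, rule))
    best = min(RULES, key=max_risk)
    return best, max_risk(best)
```

With exact fractions, `bayes_rule(F(3, 5))` selects (d1, d2) and `minimax_rule()` selects the same rule; at the boundary values of the a priori probability two rules tie and `min` simply returns the first of them.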

5.2. The average losses $E(L(\theta, d)|x)$, calculated by the given formulas for the a priori distribution $\pi(\theta_1) = \alpha$, $\alpha \in [0, 1]$, are given in the following table:

           d_1                d_2               d_3
x = 0      (1 - α)/f(0)       3α/(4f(0))        (1 + 2α)/(8f(0))
x = 1      3(1 - α)/f(1)      α/(4f(1))         (3 - 2α)/(8f(1))

We compare these values to find the minimal one in every row (and thus find $\delta^*(x)$) and get $\delta^*(0) = d_2$ for $\alpha \le 1/4$, $\delta^*(0) = d_3$ for $1/4 < \alpha \le 7/10$, $\delta^*(0) = d_1$ for $\alpha > 7/10$, $\delta^*(1) = d_2$ for $\alpha \le 3/4$, $\delta^*(1) = d_3$ for $3/4 < \alpha \le 21/22$, $\delta^*(1) = d_1$ for $\alpha > 21/22$. Thus, the sought-for Bayes solutions $\delta^* = (\delta^*(0), \delta^*(1))$ are $\delta^* = (d_2, d_2)$ for $\alpha \le 1/4$, $\delta^* = (d_3, d_2)$ for $1/4 < \alpha \le 7/10$, $\delta^* = (d_1, d_2)$ for $7/10 < \alpha \le 3/4$, $\delta^* = (d_1, d_3)$ for $3/4 < \alpha \le 21/22$, $\delta^* = (d_1, d_1)$ for $\alpha > 21/22$. Since

$$\rho(\alpha) = r(\delta^*) = \sum_{x=0}^{1} f(x)E(L(\theta, \delta^*(x))|x),$$

we take the respective values of $E(L(\theta, \delta^*(x))|x)$ from the table and find

$$\rho(\alpha) = \begin{cases} \alpha & \text{for } 0 \le \alpha \le 1/4, \\ (1 + 4\alpha)/8 & \text{for } 1/4 < \alpha \le 7/10, \\ (4 - 3\alpha)/4 & \text{for } 7/10 < \alpha \le 3/4, \\ (11 - 10\alpha)/8 & \text{for } 3/4 < \alpha \le 21/22, \\ 4(1 - \alpha) & \text{for } 21/22 < \alpha \le 1. \end{cases}$$

The function $\rho(\alpha)$ is plotted in Fig. 9.
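
The row-by-row comparison in 5.2 can be checked with exact arithmetic. The sketch below takes the six table entries multiplied by f(x) as read from the solution (the labelling of the columns as d1, d2, d3 is an assumption about the column order), picks the smallest entry in each row and sums the two minima to obtain the Bayes risk.

```python
from fractions import Fraction as F


def weighted_losses(alpha):
    """Entries f(x) * E(L(theta, d) | x) of the table for x = 0 and x = 1."""
    a = F(alpha)
    return {
        0: {"d1": 1 - a, "d2": 3 * a / 4, "d3": (1 + 2 * a) / 8},
        1: {"d1": 3 * (1 - a), "d2": a / 4, "d3": (3 - 2 * a) / 8},
    }


def bayes_solution(alpha):
    """Return ((delta*(0), delta*(1)), rho(alpha)) by minimising each row separately."""
    table = weighted_losses(alpha)
    rule = tuple(min(table[x], key=table[x].get) for x in (0, 1))
    rho = sum(table[x][d] for x, d in zip((0, 1), rule))
    return rule, rho
```

For the a priori probability 1/2 this gives the rule (d3, d2) with risk 3/8, in agreement with the branch (1 + 4a)/8 of the piecewise formula.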




Fig. 9 
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5.3. For the decision functions of Problem 5.1 the risk vectors are 

B?» = (0. 6), Rj" = (^. |) . R|" = (|, . Ri» = fe, 0) 
in Che first case and 

Ri 2 > = (0, 6), R?> = (-, £\ , Rj 2 > = (i, |V «P = fr, 0) 



in the second case. We thus conclude that .the procedure 63 is preferable com- 
pared to Si, and then b\. Si. and e 4 are the admissible procedures. For the 
values a = ir(t>0 with the Bayes solution i" = 6j the risks satisfy the inequality 
r w (6*) < r 0) (S»). 

5.4. Here the risk function is 

i 
Btf. *) = 2 t(S, S(fc))CjtJ*(L - «)'-*, 

and the risk vectors R, = (R(8\., Si), R(fli, *i)) for the given procedures are 
Ri = (5.94 x 10 **, 0.792), Ri = (5.96 x 10 "*, 0.972), Rj = (2 x 10 -6 , 
0.999), respectively. The minimax solution is £ = 5i and m(6) = 0.792. 
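5.5. Here $f(x; \theta) = (1 - \theta)\theta^x$, $x = 0, 1, 2, \ldots$, and the risk function is

$$R(\theta, \delta_i) = \sum_{x=0}^{\infty} L(\theta, \delta_i(x))f(x; \theta) = L(\theta, d_1)(1 - \theta^i) + L(\theta, d_2)\theta^i.$$

The risk vector for the $i$th procedure is

$$R_i = (R(\theta_1, \delta_i), R(\theta_2, \delta_i)) = (a\theta_1^i,\ b(1 - \theta_2^i)).$$

As $i$ increases, the first coordinate decreases and the second increases. Therefore, for $a\theta_1 \le b(1 - \theta_2)$ the maximal risk is $m(\delta_i) = b(1 - \theta_2^i)$, and hence the minimax procedure is $\tilde\delta = \delta_1$.

If $a\theta_1 > b(1 - \theta_2)$, then we find an integer $i_0$ from the conditions $a\theta_1^{i_0} \ge b(1 - \theta_2^{i_0})$, $a\theta_1^{i_0+1} < b(1 - \theta_2^{i_0+1})$ and obtain

$$m(\delta_i) = \begin{cases} a\theta_1^i, & i \le i_0, \\ b(1 - \theta_2^i), & i > i_0. \end{cases}$$

In our case $\tilde\delta = \delta_{i_0}$ if $a\theta_1^{i_0} < b(1 - \theta_2^{i_0+1})$, otherwise $\tilde\delta = \delta_{i_0+1}$.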
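
The search for the minimax index in 5.5 can be sketched as follows. It assumes that the procedures are indexed by i = 1, 2, ..., that 0 < theta_1, theta_2 < 1 and a, b > 0, and that the risk vector is (a*theta_1^i, b*(1 - theta_2^i)) as in the solution; the function names are illustrative.

```python
def max_risk(i, a, b, theta1, theta2):
    """m(delta_i) = max(a * theta1**i, b * (1 - theta2**i))."""
    return max(a * theta1 ** i, b * (1 - theta2 ** i))


def minimax_index(a, b, theta1, theta2):
    """Index i >= 1 minimising the maximal risk, together with that risk.

    The first coordinate decreases in i and the second increases, so the
    minimum is attained either at the first index past the crossing point
    or at the index just before it (i_0 in the notation of the solution).
    """
    i = 1
    while a * theta1 ** i > b * (1 - theta2 ** i):
        i += 1
    candidates = [i] if i == 1 else [i - 1, i]
    best = min(candidates, key=lambda j: max_risk(j, a, b, theta1, theta2))
    return best, max_risk(best, a, b, theta1, theta2)
```

For a = 4, b = 1 and theta_1 = theta_2 = 1/2 the crossing happens between i = 2 and i = 3, and the comparison 1 > 0.875 sends the choice to i = 3.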
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5.7. By the genera] theory (see Sec 3.4) we have h t U) = birifiW, 
fate) = tfirt/i(x), and therefore the flayes solution 6'(x) has the form 

d, for xt W,\ where W\ = [x: hi(x) ^ ^(x)) 



di for x € W u 






The respective risk vector is {aVi,{X i W\), bP tl {X $ Wj», and the Bayes risk 
is 

For f?i < fa in the indicated normal distributions we have W\ ~ 

lx: jr£ -— 1 , c = In , and P»,(Xe W,) = * I v I, 

(^ 2 Oi-e,}' air,' \ Vi } 

■ P»,iX € W\\ = 4 / c . ~ e /2 \ _ whete _ (e , _ p^Vff 1 . If 0, > 6i, then the 

region TV, is defined by the inverted inequality and all other relations remain 
as before. 

5.9. For the decision function $\delta(x) \equiv d$ the risk function is $R(\theta, \delta) = L(\theta, d)$, and the Bayes risk is

$$r(d) = \alpha L(0, d) + (1 - \alpha)L(1, d) = \alpha d^a + (1 - \alpha)(1 - d)^a.$$

Minimizing this with respect to $d$, we find the sought-for decision $d^*$, i.e., if $a = 1$, then $d^* = 0$ for $\alpha > 1/2$, $d^* = 1$ for $\alpha < 1/2$, while we may take any $d \in [0, 1]$ as $d^*$ for $\alpha = 1/2$. If $a > 1$, we have

$$d^* = \Big(1 + \Big(\frac{\alpha}{1 - \alpha}\Big)^{1/(a-1)}\Big)^{-1}.$$
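
A quick numerical check of the minimiser in 5.9, assuming the Bayes risk has the form alpha*d^a + (1 - alpha)*(1 - d)^a with 0 < alpha < 1 and a >= 1 (the function names are illustrative):

```python
def bayes_risk(d, alpha, a):
    """r(d) = alpha * d**a + (1 - alpha) * (1 - d)**a."""
    return alpha * d ** a + (1 - alpha) * (1 - d) ** a


def bayes_decision(alpha, a):
    """Minimiser of r(d) over d in [0, 1]."""
    if not 0 < alpha < 1 or a < 1:
        raise ValueError("expected 0 < alpha < 1 and a >= 1")
    if a == 1:
        if alpha > 0.5:
            return 0.0
        if alpha < 0.5:
            return 1.0
        return 0.5  # every d in [0, 1] is optimal; the midpoint is an arbitrary pick
    # stationary point of r(d): (d / (1 - d))**(a - 1) = (1 - alpha) / alpha
    return 1.0 / (1.0 + (alpha / (1 - alpha)) ** (1.0 / (a - 1)))


d_star = bayes_decision(0.25, 2)
```

For alpha = 1/4 and a = 2 the formula gives d* = 3/4, and a grid search over [0, 1] finds no point with a smaller risk.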



5,10. (1) W: write P ( ,(JT( W^) = j /,(x) it-r <for the absolutely cominu- 
ous case) and use (he definition of Aj(i) to find 

ni*> = 2 »rfW) = 2 ( i*jm dx. 

i-i /-i ^ 

By the definition of the regions W] the latter expression can be written as re- 
quired. 

(2) It is obvious that 

l( £ *dm - xt/Ax)} < hj<jc) Z l( £ *,Mx) - v/jMJ 
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{we have taken into account that l(J\J) = 0, j = 1, . , ,, k). Then 

01 (see the hint) 

k 
i, 2 ra * n (*ifit>& mat mfiC*)) £ min A/(x) 

ft 

But 

min i*ifiQc), max «■///(*)) sS 2 min fa/rfr)> «#(*». 
X' /<( 

which gives Itie upper estimate Tor r{6*) after the integration. We then have 
min (t,/,(x), max r/fj(x)) 3s min (iri/i(jc), r_Jj(x)). j - I, ..,, f- 1, 



tmin (Tif!(*>, max rj/j(x)) dx S> max /y, 

which gives the lower estimate for r(5*>, 

The estimates become strict equalities for k = 2 and /(2| 1) = /(1|2) = 1. 
5.11. We have 

fta = Ti[/i(x)dx + 7rcpi(x)rfx, x = (x,, . ,., x r ), 
where sfcj = |x: tj/,(x) ^ irj/Hx)) and 

Simple transformations allow us to write -^ as 

&l = fx; a'* - i«'< M "> + M tt) >« In ^] . » = A-'Gt*" - (i w> ). 

Consider the random variabie Y = a ' X — — a ' (p* 1 ' + (j tw J. If ^1(X) = 
ft* 10 , A), then ^"<n =V(e/2. o) and 

l^*-K^ to r)-(( te 5-f)/4 
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We find 



a similar w 

-•(-$■ 



in a similar way, q.e.d. Specifically, for m = ttj = 1/2 we have 
For Xi > H in Poisson's distributions n(\0 and n(X 2 ) we have 



+ *je * , 



where 



*' 






r: r < m 



\i ~ X2 + In 






m 



I*] being the integer part. Thus, 

= X — ■ 

1-0 
5.12. We use the results and notations of the previous problem and find 
thai the best classification region has the form 

W\ = [x: M») = 'OPJ'i/iM < M*> = ((2|l)*i/iWI 



f - « 
[ x: ax 1 

I 2 



'(j» <l » + / 1> )5 c = In 



*.'(2] 



8- 



Thus, in the Bayes solution 6* wt assume that the distribution •* , '(/«" > , A) is 

true for the observation x f WJ, otherwise (for x € Wl = K'J) we assume that 

' (^ (I> , A) is the true distribution. The risk vector (see the above solution) is 

Um - ('Pill J /1 (*)<*. W J /2W A j 

,(, 8IM (^), ( „ M (-^)). 
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Consequently, the Bayes risk is 

or IQ 
ab 

ilution is c 



For_/(2|l) = <(l|2) =1. ti = T2 = 1/2 we have c = and r(6*> = 

* { 1 . In order to obtain the mini max solution S, we find c (and (he 

least favourable a priori distribution (tti, it;)) from the equation 



For 1{Z\\) = /(l|2) = 1 the solution is c = (i-e., iri = ir t = 1/2), and then 
the maximal risk is m(S) = * ( — j . The minimax rule is defined by the 



region W\ for c = 0. 

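5.13. (1) If $\mathcal{L}(\xi) = Bi(m, \theta)$, then $f(\mathbf{x}; \theta) \propto \theta^{x}(1 - \theta)^{nm-x}$, $x = \sum_{i=1}^{n} x_i$, and the a priori distribution density of the parameter is $\pi(\theta) \propto \theta^{a-1}(1 - \theta)^{b-1}$. For the a posteriori density we have

$$\pi(\theta|\mathbf{x}) \propto \pi(\theta)f(\mathbf{x}; \theta) \propto \theta^{a+x-1}(1 - \theta)^{b+nm-x-1},$$

i.e., $\pi(\theta|\mathbf{x})$ is the density of the beta distribution $B(a + x, b + nm - x)$.
(2) Here $f(\mathbf{x}; \theta) \propto \theta^{x}(1 - \theta)^{rn}$, $\pi(\theta)$ is given above, consequently,

$$\pi(\theta|\mathbf{x}) \propto \theta^{a+x-1}(1 - \theta)^{b+rn-1}.$$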
<3) Here /(x; 6) = e*"'fl Cl ', ir(6) = fl w e-* / ", and hence 






ir{fl|x) = S x **-'e - 

U the density of the gamma distribution r 

(4) Here /(x; fl> = B*e~ *', jt(0) is given above, consequently, 

ir{ff|x) = (*♦■- " e -•<"+»'._ 

(5) Using the indicator, « write the sample density in the form 
/(*; &) = 6~" US ^ jf W ), *<„) = max (*, , .... x n ). Similarly, r{9) =■ 
0-"~ l I(8 ^ a), whence 

(6) If Ai , .... kw are non-negative integers which meet the condition 
k t + . . . + k N - ji, then /(h; tf) =- tf?' . . . tfjf, and hence 

ir(B|h) «■*?**»"*.,. •&•***" ', 

i.e., we obtain the Dirichlet £>(a + h) distribution density. 
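(7) The distribution density is

$$f(\mathbf{x}; \theta) \propto \exp\Big\{-\frac{1}{2b^2} \sum_{i=1}^{n} (x_i - \theta)^2\Big\}$$

and the a priori density is $\pi(\theta) \propto \exp\{-\frac{1}{2\sigma^2}(\theta - \mu)^2\}$. Then

$$\pi(\theta|\mathbf{x}) \propto \exp\Big\{-\frac{1}{2}\Big[\theta^2\Big(\frac{1}{\sigma^2} + \frac{n}{b^2}\Big) - 2\theta\Big(\frac{\mu}{\sigma^2} + \frac{n\bar{x}}{b^2}\Big)\Big]\Big\},$$

where we have omitted the factors independent of $\theta$. The latter expression is proportional to $\exp\{-\frac{1}{2\sigma_1^2}(\theta - \mu_1)^2\}$, q.e.d.

5.14. Let $X = x \in \{1, \ldots, n - 1\}$. The a posteriori distribution is $\pi(\theta|x) \propto \theta^x(1 - \theta)^{n-x}$, and the average loss for the decision $\delta(x) = d$ with respect to this distribution is proportional to

$$\int_0^1 (d - \theta)^2 \theta^{x-1}(1 - \theta)^{n-x-1}\,d\theta = d^2B(x, n - x) - 2dB(x + 1, n - x) + B(x + 2, n - x) = c_1\Big(d - \frac{x}{n}\Big)^2 + c_2.$$

This expression attains its minimum for $d = x/n$. If $x = 0$ or $n$, then the integral is finite only for $d = 0$ (or $d = 1$). Thus, $\delta^*(x) = x/n$ for any $x$. We also have

$$R(\theta, \delta^*) = E_\theta\Big(\frac{X}{n} - \theta\Big)^2 \Big/ \theta(1 - \theta) = D_\theta\Big(\frac{X}{n}\Big) \Big/ \theta(1 - \theta) = \frac{1}{n} = \text{const}.$$

Consequently, $\delta^*$ is the minimax solution with the risk $r(\delta^*) = 1/n$.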

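5.15. By virtue of Problem 5.13 (1), the a posteriori density is $\pi(\theta|x) \propto \theta^{a+x-1}(1 - \theta)^{b+n-x-1}$, and the average loss with respect to this distribution is

$$\int_0^1 (d - \theta)^2 \theta^{a+x-1}(1 - \theta)^{b+n-x-1}\,d\theta = c_1\Big(d - \frac{x + a}{n + a + b}\Big)^2 + c_2.$$

It follows that $\delta^*(x) = \dfrac{x + a}{n + a + b}$. We calculate the risk function

$$R(\theta, \delta^*) = E_\theta\Big(\frac{X + a}{n + a + b} - \theta\Big)^2 = D_\theta\Big(\frac{X}{n + a + b}\Big) + \Big(E_\theta\frac{X + a}{n + a + b} - \theta\Big)^2 = \frac{(a - \theta(a + b))^2 + n\theta(1 - \theta)}{(n + a + b)^2}.$$

The condition $R(\theta, \delta^*) = \text{const}$ is met for $a = b = \sqrt{n}/2$. Consequently, the minimax solution is

$$\tilde\delta(x) = \frac{x + \sqrt{n}/2}{n + \sqrt{n}}$$

and its risk is

$$m(\tilde\delta) = R(\theta, \tilde\delta) = \frac{1}{4(1 + \sqrt{n})^2}.$$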
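
The constant-risk property in 5.15 can be confirmed by summing the quadratic loss over the binomial distribution exactly. A minimal sketch, assuming X has the distribution Bi(n, theta) and the loss is (d - theta)^2; the function names are illustrative.

```python
import math


def minimax_estimate(x, n):
    """(x + sqrt(n)/2) / (n + sqrt(n)), the estimate with a = b = sqrt(n)/2."""
    return (x + math.sqrt(n) / 2) / (n + math.sqrt(n))


def quadratic_risk(theta, n):
    """E_theta (estimate(X) - theta)^2 for X ~ Bi(n, theta), computed by direct summation."""
    return sum(
        math.comb(n, x) * theta ** x * (1 - theta) ** (n - x)
        * (minimax_estimate(x, n) - theta) ** 2
        for x in range(n + 1)
    )
```

For n = 16 the risk equals 1/(4 * 25) = 0.01 whatever value of theta is substituted, which is what makes the estimate an equalizer rule.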

5.16. We write the average loss for the decision $\delta(x) = d$ with respect to the a posteriori distribution $\pi(\theta|x)$ in the form

$$E((\theta - d)^2|x) = E((\theta - \delta^*(x) + \delta^*(x) - d)^2|x) = D(\theta|x) + (\delta^*(x) - d)^2 \ge D(\theta|x).$$

The equality only holds for $d = \delta^*(x)$, and therefore $\delta^*(x)$ is the sought-for decision with the conditional (for $X = x$) risk $D(\theta|x)$. Then the Bayes risk has the indicated form.

In Problem 5.15 the a posteriori mean of the parameter is equal to the first moment of the distribution $B(a + x, b + n - x)$, i.e.,

$$\delta^*(x) = E(\theta|x) = \frac{a + x}{n + a + b}.$$

5.17. Here $\mathcal{L}(X) = \overline{Bi}(r, \theta)$. By Problem 5.13 (2) the a posteriori distribution is $\mathcal{L}(\theta|x) = B(a + x, b + r)$. Using the formulas for the moments of the beta distribution (see the introduction to Chap. 1), we find from Problem 5.16 that the sought-for Bayes estimate is

$$\delta^*(x) = E(\theta|x) = \frac{a + x}{a + b + r + x}.$$
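
The Bayes estimates of 5.15 to 5.17 are all first moments of a beta distribution after a conjugate update. The sketch below assumes integer (or rational) prior parameters so that exact fractions can be used; the helper names are illustrative.

```python
from fractions import Fraction as F


def posterior_binomial(a, b, x, n):
    """B(a, b) prior and x successes in n Bernoulli trials give B(a + x, b + n - x)."""
    return a + x, b + n - x


def posterior_negative_binomial(a, b, x, r):
    """B(a, b) prior and one negative binomial observation x give B(a + x, b + r), as in 5.17."""
    return a + x, b + r


def beta_mean(p, q):
    """First moment p / (p + q) of the beta distribution B(p, q)."""
    return F(p, p + q)
```

For example, `beta_mean(*posterior_binomial(1, 1, 3, 10))` equals (a + x)/(n + a + b) = 4/12, and `beta_mean(*posterior_negative_binomial(2, 3, 4, 5))` equals (a + x)/(a + b + r + x) = 6/14.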
5.18. By Problem 5.13 (3) the a posteriori distribution is

$$\mathcal{L}(\theta|\mathbf{x}) = \Gamma\Big(\frac{a}{na + 1},\ \lambda + x\Big), \qquad \mathbf{x} = (x_1, \ldots, x_n), \quad x = \sum_{i=1}^{n} x_i.$$

Using the formulas for the moments of the gamma distribution (see the introduction to Chap. 1), we find from Problem 5.16 that

$$\delta^*(\mathbf{x}) = \frac{a(\lambda + x)}{na + 1}, \qquad D(\theta|\mathbf{x}) = \frac{a^2(\lambda + x)}{(na + 1)^2}.$$

Since $\mathcal{L}_\theta(X) = \Pi(n\theta)$ (see Problem 1.39 (4)), the formula for the total expectation gives $EX = E(E_\theta X) = E(n\theta) = na\lambda$ and, therefore,

$$r(\delta^*) = ED(\theta|X) = \frac{a^2}{(na + 1)^2}(\lambda + EX) = \frac{\lambda a^2}{na + 1}.$$

We minimize the quantity $\dfrac{\lambda a^2}{na + 1} + cn$ in $n$ and obtain the optimal number of observations

$$n^* = \Big(\frac{\lambda a}{c}\Big)^{1/2} - \frac{1}{a}.$$

Remark. The number $n^*$ must be a non-negative integer. If we obtain a negative number, we set $n^* = 0$ (i.e., the observations are not needed). In other cases we take the smallest integer which is greater than or equal to the value obtained for the required observation number. This is true for all similar problems.
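
The rounding rule of the Remark can be written down directly. The sketch assumes that the total loss has the form lam*a^2/(n*a + 1) + c*n, so that its stationary point in n is sqrt(lam*a/c) - 1/a; the function names are illustrative.

```python
import math


def total_loss(n, lam, a, c):
    """Bayes risk plus observation cost: lam * a**2 / (n * a + 1) + c * n."""
    return lam * a * a / (n * a + 1) + c * n


def optimal_sample_size(lam, a, c):
    """Apply the Remark: 0 if the stationary point is negative, otherwise round it up."""
    n_real = math.sqrt(lam * a / c) - 1.0 / a
    if n_real <= 0:
        return 0
    return math.ceil(n_real)
```

Rounding up follows the Remark literally; since the total loss is convex in n, a cautious user may also compare the loss at the neighbouring integer below before settling on a value.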

5.20. By Problem 5.13 (4) the value d" of the Bayes estimate for X = * 
is found by minimizing 



KM «* = j^) J (-- =ii)V«- e- 



dB 



in o\ We solve — E(L(8, </)|j[) = with respect lo d and find the required 
dd 

expression for S*. 

Since Ji ( 3 Ai 1 = r(l/H *) (see Problem 1.39 (2)), we have 



'(£*)- 



_ ,. 8 + an 

E«i* = — . D 8 5* 



«ff(X + n - 1>" " e*(X + 11-I)* 

The risk function is 

R(8, «*) = E)|S* -M = D,5» + (e,S* -~\ 

= o*n + (9 - a<X - I)) 3 
tfVcx + /» - I) 1 

Finally, $r(\delta^*) = ER(\theta, \delta^*)$, and we use the formulas for the moments of the gamma distribution to find the required result. The number $n^*$ is found by minimizing $r(\delta^*) + cn$ in $n$.

5.21. The mean and variance of Pareto's distribution with the parameters 

a and a > 2 are and _ respectively. Whence (see 

a - 1 (or - l) J (a -2) 

Problem 5.13 (5)) we find the required (5* using Problem 5.16. Besides, 



Dp 1 1) ^4^ — ' < max ** *<">» 3 - 

(n + a - l) z (« + or - 2) 



In order to find the risk r(6+) ■ ED(0|X) it is sufficient to calculate 

E(max (a, x w )) 2 = E<E«(max (a, Am)) 2 ). 
We write 

(max (a, *■<„,))* = «*/Mf,., £ a) + J&t#t*<« > «)• 

where /(^) is the Indicator of the event A. Since the distribution density Xw 
for the given is m 1 "' 70\ ^ f < (see Problem US), we have 

E,(max (a, J^,,)) 1 - — I f" " » dl + - 1 f + ' tff 
0" J *" J 



" -0* + 2a " 



n + 2 (rt + 2)0" 

We then find 

E0* = D0 -ME9) 1 = ^-. E0-" = aO » [ d * = « 

or - 2 J (« + «■" („ + a ) fl 

whence 



E(E„(max (a, *<„,))») 



, ao'(n + <v - 2) 



<n + a)(a - 2) 
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The risk of the Bayes estimate is 

our* 



r(&V 



ta - 2)0i + a - i)*' 

"We minimize j-(6*> + en in n and find the optimal number of observations 

/ 2aa* V" 
" = I — ; — =- 1 -« + 1- 

5.23. We are seeking the value of d which minimizes the expectation 

n 

E(i(9, d)|h) = 2 E((0, - Aflh) 

1*3 

JV A* 

= 2 wftiw + 2 («<ftii».) - «*> 2 . 

The minimum is attained for */, = E(ft|b), i =» I, , , „ JV, and is equal to 

2 D(fl,|h). Mere (see Problem 5.13 (6) and the hint) 
f- i 

« + » (a + n) s (« 4- n + 1> 

The first formula defines the Bayes estimate 5j(h)„ and the second one allows 

IV 

us to calculate the respective risk r(&*) = 2 ED(&|h). We should also use 
the formulas (see Problem 1.52 and the hint) 

EAj = E(E»A/) = EnS; = nott/ct, 

Ehiih, - I) = E(E.fci(fli - I)) = En(n - 1)6?= — - . 

«(« + 1) 
5.24. We use the notations of Problem 5.13 (7) and find by Problem 5.16 that the sought-for estimate is

$$\delta^*(\mathbf{x}) = E(\theta|\mathbf{x}) = \mu_1 = \sigma_1^2\Big(\frac{\mu}{\sigma^2} + \frac{n\bar{x}}{b^2}\Big).$$

Here $D(\theta|\mathbf{x}) = \sigma_1^2 = \text{const}$, and the risk is $r(\delta^*) = (\sigma^{-2} + nb^{-2})^{-1}$. We minimize the total loss $r(\delta^*) + cn$ in $n$ and find the optimal sample size



5.25. (1) Let $d > d^*$. Then

$$|\theta - d| - |\theta - d^*| = \begin{cases} d^* - d & \text{for } \theta \ge d, \\ d + d^* - 2\theta & \text{for } d^* < \theta < d, \\ d - d^* & \text{for } \theta \le d^*. \end{cases}$$

Since $d + d^* - 2\theta > d^* - d$ for $d^* < \theta < d$, we have

$$E((|\theta - d| - |\theta - d^*|)|x) \ge (d^* - d)P(\theta \ge d|x) + (d^* - d)P(d^* < \theta < d|x) + (d - d^*)P(\theta \le d^*|x)$$
$$= (d - d^*)(P(\theta \le d^*|x) - P(\theta > d^*|x)).$$

Since $d^*$ is a median, the last difference is greater than or equal to zero. The case of $d < d^*$ is treated in a similar way. Thus, the decision $d^*$ minimizes the average conditional (for $X = x$) loss and is a Bayes solution.
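
The statement of 5.25 (1) is easy to illustrate on a discrete a posteriori distribution. The sketch below uses a made-up four-point distribution (an assumption for illustration only) and compares the mean absolute loss at the median with the loss at other candidate decisions.

```python
def expected_abs_loss(d, support, probs):
    """E|theta - d| for a discrete distribution given by support points and probabilities."""
    return sum(p * abs(t - d) for t, p in zip(support, probs))


def discrete_median(support, probs):
    """Smallest support point t with P(theta <= t) >= 1/2 (support must be sorted)."""
    cumulative = 0.0
    for t, p in zip(support, probs):
        cumulative += p
        if cumulative >= 0.5:
            return t
    return support[-1]


support = [0, 1, 2, 5]
probs = [0.1, 0.3, 0.4, 0.2]
d_star = discrete_median(support, probs)
```

Here the median is 2, and no point of a fine grid on [0, 5] gives a smaller mean absolute loss.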

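(2) By Problem 5.13 (7) the point $d^* = \mu_1$ is the median of the a posteriori distribution $\mathcal{L}(\theta|\mathbf{x})$. The Bayes estimate constructed from the sample $\mathbf{X} = (X_1, \ldots, X_n)$ has the form

$$\delta^*(\mathbf{x}) = \sigma_1^2\Big(\frac{\mu}{\sigma^2} + \frac{n\bar{x}}{b^2}\Big)$$

and its risk is

$$r(\delta^*) = E|\theta - \delta^*(\mathbf{X})| = E(E(|\theta - \delta^*(\mathbf{X})|\,|\,\mathbf{X})).$$

The conditional (for $\mathbf{X} = \mathbf{x}$) distribution of the random variable $\theta - \delta^*(\mathbf{X})$ is $\mathcal{N}(0, \sigma_1^2)$, and for $\mathcal{L}(Y) = \mathcal{N}(0, \sigma_1^2)$ we have $E|Y| = \sqrt{2/\pi}\,\sigma_1$. Therefore,

$$r(\delta^*) = \sqrt{2/\pi}\,(\sigma^{-2} + nb^{-2})^{-1/2}.$$

If the price of one observation is $c > 0$, we minimize the total loss $r(\delta^*) + cn$ in $n$ to find the optimal number of observations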
To Chapter 6

6.1. The representation

$$D\bar{X} = \frac{1}{n^2} \sum_{k,l=1}^{n} \operatorname{cov}(X_k, X_l) = \frac{1}{n^2} \sum_{k,l=1}^{n} R_{k-l} = \frac{1}{n}\Big[R_0 + 2\sum_{k=1}^{n-1} \Big(1 - \frac{k}{n}\Big)R_k\Big]$$

is true for the variance $D\bar{X}$. Under the condition (6.1) it implies that $D\bar{X} = O(1/n)$ as $n \to \infty$, i.e., $\bar{X}$ is a consistent estimate.
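
The two expressions for the variance of the sample mean in 6.1 can be compared numerically for any covariance function. A minimal sketch, assuming a stationary sequence whose covariance R depends only on the lag; the geometric covariance used in the usage line is just an example.

```python
def var_mean_double_sum(R, n):
    """(1/n^2) * sum over k, l of R(|k - l|)."""
    return sum(R(abs(k - l)) for k in range(n) for l in range(n)) / n ** 2


def var_mean_formula(R, n):
    """(1/n) * [R(0) + 2 * sum_{k=1}^{n-1} (1 - k/n) R(k)]."""
    return (R(0) + 2 * sum((1 - k / n) * R(k) for k in range(1, n))) / n


def geometric_cov(k):
    """Example covariance function R(k) = 0.5**k."""
    return 0.5 ** k


difference = var_mean_double_sum(geometric_cov, 10) - var_mean_formula(geometric_cov, 10)
```

The difference is zero up to rounding, and for a summable covariance such as this one n times the variance stays bounded, in line with the O(1/n) claim.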

6.3. We can find the expression 

ft - k n " 

EC t (n) - R K - ' 2j)j(*.-. + «.♦*-,) + -= / lft-i 

n(n — k) * » 

(-1,-1 ',i-i 

for the mean ECi(n). where under condition (6.1) the term added to R* is 
of the order 0(l/n) as n — «». 

6.4. Here $EX_t = 0$ and

$$\operatorname{cov}(X_{k+t}, X_t) = E(X_{k+t}X_t) = \sigma^2(\cos\lambda(k + t)\cos\lambda t + \sin\lambda(k + t)\sin\lambda t) = \sigma^2\cos\lambda k.$$

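6.5. Here $EX_t = m\sum_{i=0}^{r} a_i$ and

$$\operatorname{cov}(X_{k+t}, X_t) = \sum_{i,j=0}^{r} a_ia_j \operatorname{cov}(\varepsilon_{k+t-i}, \varepsilon_{t-j}) = \begin{cases} \sigma^2 \sum_{j=0}^{r-|k|} a_ja_{j+|k|} & \text{for } |k| \le r, \\ 0 & \text{for } |k| > r, \end{cases}$$

because $\operatorname{cov}(\varepsilon_{k+t-i}, \varepsilon_{t-j}) = \sigma^2$ for $i - j = k$, while it is zero for $i - j \ne k$.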
6 J. The coefficients of the optimal predictor are found from 

(J,'™ = 2 K tf Rj-i. wh=re 1*1 = \R'-A~'- <<>J = °. _1 -")■ For 

J. -n 

<7 2 («) the representation ^(n) = |R(n + 2)|/lR(n + l)| is true (see [7, p. 263]) 
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with (he matrix 



R(/i + 1) 



RoXt A> 

£,/?« fl,-i 



ftrrlift 



.Jto 



6.9. Since $\|p_{ij}(t)\| = \|p_{ij}(1)\|^t$, it is sufficient to verify that $\|p_{ij}(t + 1)\| = \|p_{ij}(t)\|\,\|p_{ij}(1)\|$. The matrix $\|p_{ij}(1)\|$ is doubly stochastic and, therefore, the stationary distribution is uniform.
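6.10. Let $U, U_0, U_1, \ldots$ be a sequence of independent random variables uniformly distributed on the segment $[0, 1]$. We find the random variables

$$f_0(U) = \begin{cases} 1 & \text{for } U < 1/2, \\ 2 & \text{for } U \ge 1/2, \end{cases} \qquad f^{(1)}(U) = \begin{cases} 1 & \text{for } U < 1 - \alpha, \\ 2 & \text{for } U \ge 1 - \alpha, \end{cases} \qquad f^{(2)}(U) = \begin{cases} 1 & \text{for } U < \alpha, \\ 2 & \text{for } U \ge \alpha. \end{cases}$$

Then the realization of the Markov chain is defined by

$$\nu_0 = f_0(U_0), \qquad \nu_t = f^{(\nu_{t-1})}(U_t), \quad t \ge 1.$$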
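
The construction in 6.10 translates directly into a simulation routine. The sketch assumes a two-state chain with states 1 and 2, a uniform initial distribution and probability alpha of changing the state at each step; the uniform numbers are passed in explicitly so that either table values or a generator can be used.

```python
import random


def f0(u):
    """Initial state: 1 if u < 1/2, otherwise 2."""
    return 1 if u < 0.5 else 2


def next_state(state, u, alpha):
    """One transition: from state 1 stay when u < 1 - alpha; from state 2 move to 1 when u < alpha."""
    if state == 1:
        return 1 if u < 1 - alpha else 2
    return 1 if u < alpha else 2


def simulate_chain(uniforms, alpha):
    """Turn a sequence U_0, U_1, ... of uniform numbers into a realization nu_0, nu_1, ..."""
    it = iter(uniforms)
    states = [f0(next(it))]
    for u in it:
        states.append(next_state(states[-1], u, alpha))
    return states


path = simulate_chain([0.3, 0.9, 0.1, 0.5], alpha=0.2)
```

With a generator in place of the fixed list, for example `simulate_chain((random.random() for _ in range(1000)), 0.2)`, the same function produces a realization of arbitrary length.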
6.11. We have $E\eta_t = 0$ and

$$R_k = E\eta_{t+k}\eta_t = P(\nu_{t+k} = \nu_t) - P(\nu_{t+k} \ne \nu_t) = \tfrac{1}{2}\big(p_{11}(k) + p_{22}(k) - p_{12}(k) - p_{21}(k)\big) = (1 - 2\alpha)^k.$$

6.13. Here Etj, = E(E (£„(/) [*■,)) => because Efc(r) = 0. Then (see 
Problem 6.9) we have 

E w .„,, = E(E(J.,.,<* + »>&i<0!»*+<. *■')) 

= P( Pt + , . * = 1)4'' + *(»■*., « »- 2)*i 2 > 

= i (1 + (I _ 2«)*)(M ,J + JJpX 

4 



Appendix 

1. Normal Distribution
Quantiles $u_p$: $p = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u_p} e^{-x^2/2}\,dx$
2. Poisson's Distribution
4. $\chi^2(n)$ Distribution



Quantiles $\chi^2_{p,n}$: $p = \int_0^{\chi^2_{p,n}} k_n(x)\,dx$

n 



0.3 0.5 0.7 0.9 0.95 0.990 0.999 
5. Student's Distribution S(n)
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Notes. (I) The values of F , ,,„,„ are in the fust rows, she values of Hjmm are in the second rows. (2} Here m are the degrees 
of freedom for the greater variance, nj are the degrees of freedom for the smaller variance. 
7. Kolmogorov's Test
Values of the function $\lambda_p$: $p = P(D_n = \sup_x |F_n(x) - F(x)| > \lambda_p)$
8. Smirnov's Test

The probabilities $P(D_{nn} \le k/n)$, where $D_{nn} = \sup_x |F_{1n}(x) - F_{2n}(x)|$
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1.000 


1.000 






  n     k = 1      2      3      4      5      6      7      8      9     10     11
 10     0.006  0.213  0.582  0.832  0.948  0.988  0.998  1.000  1.000  1.000
 11     0.003  0.167  0.521  0.789  0.925  0.979  0.996  0.999  1.000  1.000  1.000
 12     0.002  0.131  0.464  0.744  0.900  0.969  0.992  0.999  1.000  1.000  1.000
 13     0.001  0.102  0.412  0.700  0.874  0.956  0.987  0.997  1.000  1.000  1.000
 14     0.000  0.079  0.365  0.657  0.84?  0.941  0.981  0.995  0.999  1.000  1.000
 15     0.000  0.062  0.322  0.614  0.816  0.925  0.974  0.992  0.998  1.000  1.000
 16     0.000  0.048  0.284  0.574  0.785  0.907  0.965  0.989  0.997  1.000  1.000
 17     0.000  0.037  0.249  0.535  0.755  0.888  0.955  0.984  0.995  0.999  0.999
 18     0.000  0.028  0.219  0.497  0.725  0.868  0.944  0.979  0.993  0.998  0.999
 19     0.000  0.022  0.192  0.462  0.694  0.847  0.932  0.973  0.991  0.997  0.999
 20     0.000  0.017  0.168  0.429  0.664  0.825  0.919  0.966  0.988  0.996  0.999
 21     0.000  0.013  0.147  0.397  0.635  0.804  0.905  0.959  0.984  0.995  0.998
 22     0.000  0.010  0.128  0.368  0.606  0.782  0.891  0.951  0.980  0.993  0.998
 23     0.000  0.008  0.112  0.340  0.578  0.759  0.876  0.942  0.975  0.991  0.997
 24     0.000  0.006  0.098  0.314  0.551  0.737  0.860  0.932  0.970  0.988  0.996
9. Random Numbers Uniformly Distributed on [0, 1]



0.5916 


3406 


6079 


4101 


5314 


6362 


7463 


3203 


1643 


5825 


3127 


1413 


9711 


6253 


4135 


0640 


0120 


3993 


3136 


3821 


3617 


6700 


5940 


9629 


L094 


712a 


6396 


6787 


3147 


2625 


6635 


5477 


9121 


4513 


6213 


9162 


3901 


7480 


6319 


2645 


9313 


5889 


0399 


2226 


7919 


8216 


S851 


4184 


0471 


9664 


9470 


2099 


1992 


0836 


5050 


3361 


6387 


3374 


4963 


1255 


5303 


5501 


4237 


5307 


8954 


1039 


9430 


6838 


4188 


2383 


9031 


4215 


1197 


8764 


8382 


9481 


4474 


83 IS 


1752 


8546 


8922 


6145 


5759 


5489 


1479 


5725 


6542 


2141 


7449 


1653 


4398 


7198 


5643 


3687 


2311 


3652 


5889 


S865 


2378 


2198 


9612 


044S 


9632 


3741 


4776 


6836 


0101 


S861 


2786 


5132 


4601 


8247 


6883 


2196 


6S70 


9154 


7397 


3584 


2139 


1019 


2212 


S036 


6484 


9953 


8382 


7138 


2036 


5270 


7441 


4387 


9192 


9019 


7880 


4728 


0115 


3072 


2267 


6512 


3673 


2943 


2380 


4955 


7803 


1907 


5803 


3290 


SS62 


2358 


3986 


1904 


4448 


1790 


1932 


0833 


7005 


7042 


4161 


9279 


4049 


1693 


5978 


5412 


2134 


9202 


7586 


7147 


7403 


S033 


8549 


6005 


4386 


9362 


6122 


0193 


1987 
10. Random Numbers Normally Distributed as $\mathcal{N}(0, 1)$



- 0.486 


0.856 


- 0.491 


- 1.983 


-1.787 


- 0.261 


-0.256 


-0.212 


0.219 


0.779 


-0.105 


-0.357 


0.065 


0.415 


-0.169 


0.313 


-1.339 


1.827 


1.147 


- 0.121 


1.096 


0.181 


1.041 


0.535 


-0.199 


-0.246 


1.239 


- 2.574 


0.279 


- 2.056 


1.237 


1.046 


-0.508 


- 1.630 


-0.146 


-0.392 


-1.384 


0,360 


- 0,992 


-0.116 


-1.698 


- 2.-832 


-0.959 


0.424 


0.969 


-1.141 


-1.041 


0.362 


0.731 


1.377 


0.983 


-1.330 


1.620 


-1.040 


0.717 


- 0.873 


-I.C96 


-1.396 


1.047 


0.089 


-1.805 


-2.008 


-1.633 


0.542 


0.250 


-0.166 


-1.186 


1.180 


1,114 


0.882 


1.265 


-0.202 


0.658 


- 1.141 


1.151 


-1.210 


-0.927 


0.425 


-0.439 


0.358 


-1.939 


0.891 


-0.227 


0.602 


-1.399 


-0.230 


0.385 


-0.649 


-0.577 


0.237 


0.032 


0.079 


0.199 


0.208 


-1.083 


- 0.219 


0.151 


-0.376 


0.159 


0,372 


-0.313 


0.084 


0,290 


- 0.902 


2.273 


0.606 


0.606 


- 0.747 


0.873 


-0.437 


0.041 


-0.307 


0.121 


0.790 


-0.289 


0.513 


- 1.132 


-2.098 


0.921 


0.145 


- 0.291 


1.221 


1.119 


0.004 


0.768 


0.079 


- 2.828 


- 0.439 


- 0.792 


- 1.275 


0.375 


- 1.656 


0.247 


1.291 


0.063 


-1.793 


-0.513 


-0.344 


-0.584 


0.541 


0.484 


-0.986 


0.292 


- 0.521 


0.446 


-1.661 


1.045 


-1.363 


1.026 


2.990 


0.034 


- 2.127 


0.665 


0.084 


- 0.880 


-1.473 


0.234 


-0.656 


0.340 


-0.086 


-0.158 


-0.851 


-0.736 


1.041 


0.008 


0.427 


-0.831 


0.210 


-1.206 


- 0.899 


0.110 


-0.528 


-0.813 


1.266 


-0.491 


-1.114 


1.297 


-1.433 


-1.345 


-0.574 


- 1.334 


1.278 


- 0.568 


-0.109 


- 0.515 


-0.566 


- 0.287 


-0.144 


- 0.254 


0.574 


- 0.451 


-1.181 


0.161 


-0.886 


- 0.921 


-0.509 


1.410 


-0.518 


-1.346 


0.193 


-1.202 


0.394 


-1.045 


0.843 


1.2 SO 


-0.199 


-0.288 


1.810 


1.378 


0.584 






2.923 


0.50O 


0.630 


-0.537 


0.782 


0.060 


-L190 


- 0.318 


0.375 


- 1.941 


Q.247 


-0.491 


0J92 


-0.432 


-1.420 


0.489 


-1.711 


-1.186 


0,942 


1.04S 


-0.151 


-0.243 


- 0.430 


-0.762 


1.216 


0.733 


-0.309 


0.531 


0.416 


- 1.541 


0.499 


-0.431 


1,705 


J. 164 


0.424 


-0.444 


0.665 


-0,135 


-0.145 


-0.498 


0.393 


0.658 


0.754 


-0.732 


-0.066 


1,006 


0.862 


- 0.885 


0.298 


1.049 


1.B10 


2.SS5 


0.235 


-0.628 


1.456 


2.040 


-0.124 


0.196 


-0.853 


0.402 


0.593 


0.993 


-0.106 


0.116 


0.484 


- 1.272 


-1.127 


-1.407 


-1.579 


-1.616 


1.458 


1.262 


-0.142 


-0.504 


0.532 


1.381 


0.022 


-0.281 


-0.023 


-0.463 


-0.899 


-0.394


-0.538


1.707 


0.777 


0.833


0.410 


-0.349 


-1.094 


0.580 


0.241 


-0.957 


-1.885


0.371 


-2.830 


-0.238 


0.022 


0.525 


-0.255 


-0.702


0.953


-0.869


-0.853


-1.865


-0.423


-0.973


-1.016


-1.726 


-0.501 


-0.273 


0.857 


-0.465 


-1.691


0.417 


0.439 


-0.035 


-0.260 


0.120 


-9.558 


0.056 
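

The tabulated values can serve as ready-made observations of a standard normal variable. As a minimal sketch (the function name `scale_normal` and the parameters mu = 10, sigma = 2 are illustrative assumptions, not taken from the book), a value z from the table is converted into a value from N(mu, sigma^2) by x = mu + sigma z. The five z values are copied from the table above.

```python
import statistics

# Five entries copied from the table of N(0, 1) random numbers above.
z = [2.923, 0.630, -0.537, 0.782, 0.060]


def scale_normal(values, mu, sigma):
    """Map standard normal values z to mu + sigma * z (hypothetical helper)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return [mu + sigma * v for v in values]


# mu = 10 and sigma = 2 are arbitrary illustrative parameters.
sample = scale_normal(z, mu=10.0, sigma=2.0)

# The sample mean can then be compared with the known mu used in the simulation.
print(sample)
print(statistics.mean(sample))
```

With so few values the sample mean will usually differ noticeably from mu; a longer run of table entries gives a closer comparison.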



List of Distributions 



1. Normal
   1: 9, 10, 11, 27, 29, 32, 39, 41, 43, 46, 47, 56, 57, 58, 60, 61.
   2: 13, 14, 15, 16, 17, 18, 19, 20, 43, 48, 50, 52, 53, 64, 65, 66, 67, 68, 70, 71, 72,
   84, 85, 86, 87, 88, 89, 94, 106, 109, 110, 114, 115, 116, 117, 118, 119, 120, 121, 122,
   123, 124, 125, 126, 131, 142, 143, 145, 146, 148.
   3: 35, 45, 47, 48, 49, 50, 58, 59, 60, 61, 65, 66, 67, 73, 74, 79, 80, 84, 87.
   4: 8, 9, 10, 11, 13, 14, 15, 19, 20, 21, 22, 23, 24, 25, 27, 28, 29, 34, 36.
   5: 7, 8, 13, 24, 25.

2. Normal multivariate N(μ, Σ)
   1: 28, 40, 53, 59, 62, 63. 2: 39, 44, 90, 91, 92, 93, 132, 150. 3: 52, 72, 88.
   5: 11, 12.

3. Binomial Bi(n, p)
   1: 1, X B. ™, W, 15, 16, 17, 18, 39, 52, 54.
   2: 5, 6, 7, 8, 43, 48, 57, 84, 109, 110, 133, 134, 135, 136, 148.
   3: 1, 2, 5, 17, 18, 39, 40, 46, 53, 63, 75, 77. 5: 1, 2, 3, 4, 13, 14, 15, 16.

4. Polynomial M(n; p1, ..., pN)
   1: 3, 19, 20, 26, 39, 52, 53, 54. 2: 29, 38, 45, 63, 144.
   3: 3, 4, 7, 8, 9, 12, 13, 20, 24, 34, 37, 38. 5: 13, 23.

5. Poisson's Π(λ)
   1: 39, 54, 55, 60, 64.
   2: 9, 10, 31, 43, 48, 58, 59, 61, 84, 97, 108, 109, 137, 138, 139.
   3: 14, 15, 16, 41, 54, 64, 76, 78. 5: 6, 11, 13, 18, 19.

6. Binomial negative B̄i(r, p)
   1: 39, 55. 2: 11, 12, 43, 48, 62, 84, 96, 140. 3: 42, 55. 5: 5, 13, 17.

7. Gamma Γ(a, λ)
   1: 6, 7, 8, 21, 34, 39, 42, 44, 51, 55.
   2: 21, 22, 30, 43, 48, 51, 73, 74, 75, 84, 104, 109, 127, 128, 130, 141.
   3: 10, 11, 43, 56, 62, 68, 69. 5: 13, 18, 20.

8. Uniform R(a, b)
   1: 5, 19, 35, 36, 43, 46. 2: 23, 24, 25, 32, 79, 80, 81, 100, 101, 107, 110, 129, 148.
   3: 25, 36, 45, 70. 4: 7, 12, 33, 35. 5: 13, 21, 22.

9. Weibull's W(a, a, β)
   1: 37. 2: 26, 76, 77, 102, 103, 107, 130. 3: 71.

10. Cauchy's C(a)
   1: 46. 2: 28, 43, 99. 3: 44.

11. Hypergeometric H(r, N, n)
   1: 33, 113. 3: 57.

12. Chi-square χ²(n)
   1: 40, 45, 47, 51, 57, 59, 147. 3: 19.

13. Beta B(a, b)
   1: 44, 47, 48, 49. 2: 48. 5: 13, 17.

14. Student's S(n)
   1: 47, 50, 59.

15. Snedecor's S(m, n)
   1: 48, 49, 50, 51.

16. Logistic
   2: 27, 49.

17. Power series
   2: 60, 96, 140.

18. Pareto's II(a, a)
   5: 13, 21.

19. Dirichlet's D(a)
   5: 13, 23.

20. Laplace
   2: 105.

21. Kapteyn's
   2: 95.

22. Gaussian inverse
   1: 54.

23. Finite population
   2: 34, 35, 36, 37, 83, 111, 112.



Bibliography 



1. Bickel P. J. and Dofcsum K, A, Mathematical Statistics. Basic Ideas and 
Selected Topics. Holden-Day, San Francisco, etc, 1977, 

2. Chistyakov V. P. A Course in Probability Theory. Nauka, Moscow, 1987 
(in Russian). 

3. Cramer H, Mathematical Methods of Statistics. Princeton, 1946. 

4. Ermakov S. M. and Mikhailov G. A. Statistical Simulation. Nauka, 
Moscow, 1982 (in Russian). 

J. Feller W. An Introduction to Probability Theory and Its Applications. 
Vol. 1, Third Edition, J. Wiley &. Sons, New York, 1968. 

6. Ivchenko G. I., Glibochenko A. F„ Ivanov V. A., and Medvedcv Yu. 1. 
Statistical Analysis of Discrete Random Sequences, Moscow Institute of Elec- 
tronic- Engineering, Moscow, 1984 (in Russian). 

7. Ivchenko G. 1. and Medvedev Yu. I. Mathematical Statistics. Mir, 
Moscow, 1990. 

B. Kendall M. G. and Stuari A. The Advanced Theory of Statistics. Vol. 1. 
Distribution Theory. Third Edition, Griffin, London, 1969. 

9. Knuth D. E. 77ic Art of Computer Programming. Vol. 2, Second Edi- 
tion, Eddison Wesley, Reading (Mass.), cw., 1981. 

10. Lehman E. L. Testing Statistical Hypotheses. J, Wiley & Sons, New 
York, 1959. 



To the Reader 

Mir Publishers would be grateful for your comments on 
the content, translation and design of this book. We would 
also be pleased to receive any other suggestions you may 
wish to make. 

Our address is: 

Mir Publishers 

2 Pervy Rizhsky Pereulok

I-110, GSP, Moscow, 129820

USSR 



